Abstract

We present a declarative language, $\mathcal{PP}$, for the high-level specification of preferences between
possible solutions (or trajectories) of a planning problem. This novel language allows users to
elegantly express non-trivial, multi-dimensional preferences and priorities over such preferences.
The semantics of $\mathcal{PP}$ allows the identification of most preferred trajectories for a given
goal. We also provide an answer set programming implementation of planning problems with
$\mathcal{PP}$ preferences.
1 Introduction and Motivation

Planning, in its classical sense, is the problem of finding a sequence of actions that
achieves a predefined goal (Reiter 2001). Most of the research in AI planning has
been focused on methodologies and issues related to the development of efficient
planners. To date, several efficient planning systems have been developed; e.g., see
(Long et al.). These developments can be attributed to the discovery of good
domain-independent heuristics, the use of domain-specific knowledge, and the development of
efficient data structures used in the implementation of planning algorithms. Logic
programming has played a significant role in this line of research, providing a declarative
framework for the encoding of different forms of knowledge and its effective use during
the planning process (Son et al. 2005).

However, relatively limited effort has been placed on addressing several important
aspects of real-world planning domains, such as plan quality and preferences about
plans. In many real-world frameworks, the space of feasible plans to achieve the goal is
dense, but many of these plans, even if executable, may present undesirable features. In
these frameworks, it may be simple to find a solution ("a" plan); the challenge, rather,
is to produce a solution that is considered satisfactory w.r.t. the needs and preferences
of the user. Thus, feasible plans may have a measure of quality, and only a subset of
them may be considered acceptable. These issues can be seen in the following example.



* This paper is an extended version of a paper that appeared in the proceedings of the 7th
International Conference on Logic Programming and Non-Monotonic Reasoning, 2004.
Example 1

Let us consider planning problems in the travel domain. A planning problem in this
domain can be represented by the following elements¹:

• a set of fluents of the form at($l$), where $l$ denotes a location, such as home,
school, neighbor, airport, etc.;

• an initial location $l_i$;

• a final location $l_f$; and

• a set of actions of the form method($l_1$, $l_2$), where $l_1$ and $l_2$ are two distinct
locations and method is one of the available transportation methods, such as drive,
walk, ride-train, bus, taxi, fly, bike, etc. In addition, there might be conditions
that restrict the applicability of actions in certain situations. For example, one
can ride a taxi only if the taxi has been called, which can be done only if one
has some money; one can fly from one place to another only if he/she has the ticket;
etc.

Problems in this domain are often rich in solutions because of the large number
of actions which can be used in the construction of a plan. Consider, for example, a
simple situation in which a user wants to construct a 3-leg trip that starts from a
location $l_1$ and ends at $l_4$, and there are 10 ways to move along each leg, one of them
being the action walk($l_i$, $l_{i+1}$). The number of possible plans is $10^3$ and

walk($l_1$, $l_2$), walk($l_2$, $l_3$), walk($l_3$, $l_4$)

is a possible plan that achieves the goal. In most cases, the user is likely to
dismiss this plan and select another one for various reasons; among them, the total
distance from $l_1$ to $l_4$ might be too large, the time and/or energy required to complete
the plan would be too much, etc. This plan, however, would be a reasonable one,
and most likely the only acceptable solution, for someone wishing to visit his/her
neighbors.

In selecting the plan deemed appropriate for him/herself, the user's preferences play 
an important role. For example, a car-sick person would prefer walking over driving 
whenever the action walk can be used. A wealthy millionaire cannot afford to waste 
too much time and would prefer to use a taxi. A poor student would prefer to bike over 
riding a taxi, simply because he cannot afford the taxi. Yet, the car-sick person will
have to ride a taxi whenever other means of transportation are not available; the millionaire
will have to walk whenever no taxi is available; and the student will have to use a 
taxi when he does not have time. In other words, there are instances where a user's 
preference might not be satisfied and he/she will have to use plans that do not satisfy 
such preference. □ 

The above discussion shows that users' preferences play a decisive role in the choice 
of a plan. It also shows that hard-coding user preferences as a part of the goal is 
not a satisfactory way to deal with preferences. Thus, we need to be able to evalu- 
ate plan components at a finer granularity than simply as consistent or violated. In 
(Myers and Lee 1999), it is argued that users' preferences are of vital importance in



¹ Precise formulae will be presented later.
selecting a plan for execution when the planning problem has too many solutions. It
is worth observing that, with a few exceptions, like the system SIPE-2 with
meta-theoretic biases (Myers and Lee 1999), most planning systems do not allow users to
specify their preferences and use them in finding plans. The responsibility in selecting
the most appropriate plan rests solely on the users. It is also important to observe that 
preferences are different from goals in a planning problem: a plan must satisfy the 
goal, while it may or may not satisfy the preferences. The distinction is analogous to 
the separation between hard and soft constraints (Bistarelli et al. 2000). For example,
let us consider a user with the goal of being at the airport who prefers to use a taxi 
over driving his own car; considering his preference as a soft constraint, then the user 
will have to drive his car to the airport if no taxi is available; on the other hand, if 
the preference is considered as a hard constraint, no plan will achieve the user's goal
when no taxi is available. 

In this paper, we will investigate the problem of integrating user preferences into a 
planner. We will develop a high-level language for the specification of user preferences, 
and then provide a logic programming encoding of the language, based on Answer Set
Programming (Niemelä 1999). As demonstrated in this work, normal logic programs
with answer set semantics (Gelfond et al. 1990) provide a natural and elegant framework
to effectively handle planning with preferences.

We divide the preferences that a user might have into different categories:

• Preferences about a state: the user prefers to be in a state $s$ that satisfies a
property $\varphi$ rather than a state $s'$ that does not satisfy it, in case both lead to
the satisfaction of his/her goal;

• Preferences about an action: the user prefers to perform the action $a$, whenever
it is feasible and it allows the goal to be achieved;

• Preferences about a trajectory: the user prefers a trajectory that satisfies a certain
property $\psi$ over those that do not satisfy this property;

• Multi-dimensional preferences: the user has a set of preferences, with an ordering
among them. A trajectory satisfying a more favorable preference is given
priority over those that satisfy less favorable preferences.

It is important to observe the difference between $\varphi$ and $\psi$ in the above definitions: $\varphi$
is a state property, whereas $\psi$ is a formula over the whole trajectory (from the initial
state to the state that satisfies the given goal).

The rest of this paper is organized as follows. In Section 2 we review the foundations
of answer set planning. Section 3 presents the high-level preference language $\mathcal{PP}$.
We then describe a methodology to compute preferred trajectories using answer set
planning, discuss the related work, and present the final discussion and conclusions.

2 Preliminary Answer Set Planning 

In this section we review the basics of planning using logic programming with answer
set semantics, known as Answer Set Planning (or ASP) (Dimopoulos et al. 1997; Lifschitz 2002;
Subrahmanian and Zaniolo 1995). We will assume that the effect of actions on the
world and the relationship between fluents in the world are expressed in an appropriate
language. In this paper, we make use of the ontologies of a variation of the action
description language $\mathcal{B}$ (Gelfond and Lifschitz 1998). In this language, an action theory
is defined over two disjoint sets of names: the set of actions $A$ and the set of
fluents $F$. An action theory is a pair $(D, I)$, where

• $D$ is a set of propositions expressing the effects of actions, the relationship
between fluents, and the executability conditions for the actions²;

• $I$ is a set of propositions representing the initial state of the world.

Instead of presenting a formal definition of $\mathcal{B}$, we introduce the syntax of the language
by presenting an action theory representing the travel domain of Example 1. We write

• at($l$), where $l$ is a constant representing a possible location, such as home, airport,
school, neighbor, bus_station, to denote the fact that the agent³ is at the
location $l$;

• available_car to denote the fact that the car is available for the agent's use;

• has_ticket($l_1$, $l_2$) to denote the fact that the agent has the ticket to fly from $l_1$
to $l_2$; etc.

The action of driving from location $l_1$ to location $l_2$ causes the agent to be at the
location $l_2$ and is represented in $\mathcal{B}$ by the following dynamic causal law:

drive($l_1$, $l_2$) causes at($l_2$) if at($l_1$).

This action can only be executed if the car is available for the agent's use at the
location $l_1$ and there is a road connecting $l_1$ and $l_2$. This information is represented
by an executability condition:

drive($l_1$, $l_2$) executable_if available_car, at($l_1$), road($l_1$, $l_2$).

The fact that one can only be at one location at a time is represented by the following
static causal law ($l_1 \neq l_2$):

$\neg$at($l_2$) if at($l_1$).

Other actions with their executability conditions and effects are represented in a similar
way.

To specify the fact that the agent is initially at home, he has some money, and a
car is available for him to use, we write

initially(at(home))
initially(has_money)
initially(available_car(home))

Example 2 

² Executability conditions were not originally included in the definition of the language $\mathcal{B}$ in
(Gelfond and Lifschitz 1998).
³ Throughout the paper, we assume that we are working in a single agent (or user) environment.
Fluents and actions with variables are shorthand representing the set of their ground instantiations.



Planning with Preferences using Logic Programming 






Below, we list some more actions with their effects and executability conditions, using
$\mathcal{B}$.

walk($l_1$, $l_2$)        causes  at($l_2$)                  if  at($l_1$), road($l_1$, $l_2$)
bus($l_1$, $l_2$)         causes  at($l_2$)                  if  at($l_1$), road($l_1$, $l_2$)
flight($l_1$, $l_2$)      causes  at($l_2$)                  if  at($l_1$), has_ticket($l_1$, $l_2$)
take_taxi($l_1$, $l_2$)   causes  at($l_2$)                  if  at($l_1$), road($l_1$, $l_2$)
buy_ticket($l_1$, $l_2$)  causes  has_ticket($l_1$, $l_2$)
call_taxi($l$)            causes  available_taxi($l$)        if  has_money
rent_car($l$)             causes  available_car($l$)         if  has_money

bus($l_1$, $l_2$)         executable_if  has_money
flight($l_1$, $l_2$)      executable_if  connected($l_1$, $l_2$)
take_taxi($l_1$, $l_2$)   executable_if  available_taxi($l_1$)
buy_ticket($l_1$, $l_2$)  executable_if  has_money

where the $l$'s denote locations, airports, or bus stations. The fluents and actions are
self-explanatory. □

Since our main concern in this paper is not the language for representing actions
and their effects, we omit here the detailed definition of the proposed variation of $\mathcal{B}$
(Gelfond and Lifschitz 1998). It suffices for us to remind the readers that the semantics
of an action theory is given by the notion of state and by a transition function $\Phi$ that
specifies the result of the execution of an action $a$ in a state $s$ (denoted by $\Phi(a, s)$).
Each state $s$ is a set of fluent literals satisfying the two properties:

1. for every fluent $f \in F$, either $f \in s$ or $\neg f \in s$, but $\{f, \neg f\} \not\subseteq s$; and

2. $s$ satisfies the static causal laws.

A state $s$ satisfies a fluent literal $f$ ($f$ holds in $s$), denoted by $s \models f$, if $f \in s$. A state
$s$ satisfies a static causal law

$f$ if $p_1, \ldots, p_n$

if, whenever $s \models p_i$ for every $1 \le i \le n$, we have that $s \models f$. An action $a$ is
executable in a state $s$ if there exists an executability condition

$a$ executable_if $p_1, \ldots, p_n$

in $D$ such that $s \models p_i$ for every $i$, $1 \le i \le n$. An action theory $(D, I)$ is consistent if

1. $s_0 = \{f \mid \mathit{initially}(f) \in I\}$ is a state, and

2. for every action $a$ and state $s$ such that $a$ is executable in $s$, we have that
$\Phi(a, s) \neq \emptyset$.

In this paper, we will assume that $(D, I)$ is consistent. A trajectory of an action
theory $(D, I)$ is a sequence $s_0 a_1 s_1 \ldots a_n s_n$ where the $s_i$'s are states, the $a_i$'s are actions, and
$s_{i+1} \in \Phi(a_{i+1}, s_i)$ for $i \in \{0, \ldots, n-1\}$.

A planning problem is specified by a triple $(D, I, G)$, where $(D, I)$ is an action
theory and $G$ is a fluent formula (a propositional formula constructed from fluent
literals and propositional connectives) representing the goal. A possible solution to
$(D, I, G)$ is a trajectory $\alpha = s_0 a_1 s_1 \ldots a_m s_m$, where $s_0 \models I$ and $s_m \models G$. In this case,
we say that the trajectory $\alpha$ achieves $G$.
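To make the notions of state, executability, and trajectory concrete, the following Python sketch encodes a small fragment of the travel domain. It is only an illustration under simplifying assumptions that are not part of the formal definitions: a state is represented by the set of fluents that are true (absent fluents are false), the transition function is deterministic, only drive and walk are modeled, and the road map ROADS as well as all function names are hypothetical.

```python
from typing import FrozenSet, List, Optional, Tuple

State = FrozenSet[str]          # fluents that are true; every other fluent is false
Action = Tuple[str, str, str]   # (method, origin, destination)

# Hypothetical road map used only for this illustration.
ROADS = {("home", "school"), ("school", "home")}


def executable(action: Action, state: State) -> bool:
    """Check the executability conditions of drive and walk in a state."""
    method, src, dst = action
    if f"at({src})" not in state or (src, dst) not in ROADS:
        return False
    if method == "drive":
        return "available_car" in state
    return method == "walk"


def result(action: Action, state: State) -> Optional[State]:
    """Deterministic stand-in for Phi(a, s); None when a is not executable in s."""
    if not executable(action, state):
        return None
    _, _, dst = action
    # Dynamic law: the agent is at dst. Static law: it is at no other location.
    kept = frozenset(f for f in state if not f.startswith("at("))
    return kept | {f"at({dst})"}


def run(initial: State, plan: List[Action]) -> Optional[List[State]]:
    """Return the states s_0, ..., s_n of the trajectory, or None if the plan fails."""
    states = [initial]
    for action in plan:
        nxt = result(action, states[-1])
        if nxt is None:
            return None
        states.append(nxt)
    return states


def achieves(initial: State, plan: List[Action], goal_fluent: str) -> bool:
    """True iff the plan is executable and the goal fluent holds in the final state."""
    states = run(initial, plan)
    return states is not None and goal_fluent in states[-1]
```

For instance, with the initial state {at(home), available_car, has_money}, the one-step plan drive(home, school) achieves at(school), whereas walk(school, home) is not executable because the agent is not at school.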
Answer set planning (Dimopoulos et al. 1997; Lifschitz 2002; Subrahmanian and Zaniolo 1995)
solves a planning problem $(D, I, G)$ by translating it into a logic program $\Pi(D, I, G)$
which consists of (i) rules describing $D$, $I$, and $G$; and (ii) rules generating action
occurrences. It also has a parameter, length, declaring the maximal length of the
trajectory that the user can accept. The two key predicates of $\Pi(D, I, G)$ are:

• holds($f$, $t$): the fluent literal $f$ holds at the time moment $t$; and

• occ($a$, $t$): the action $a$ occurs at the time moment $t$.

holds($f$, $t$) can be extended to define holds($\varphi$, $t$) for an arbitrary fluent formula $\varphi$,
which states that $\varphi$ holds at the time moment $t$. Details about the program $\Pi(D, I, G)$
can be found in (Son et al. 2005)⁴. The key property of the translation of $(D, I, G)$
into $\Pi(D, I, G)$ is that it ensures that each trajectory achieving $G$ corresponds to an
answer set of $\Pi(D, I, G)$, and each answer set of $\Pi(D, I, G)$ corresponds to a trajectory
achieving $G$.

Theorem 1

(Son et al. 2005) For a planning problem $(D, I, G)$ with a consistent action theory
$(D, I)$ and maximal plan length $n$,

1. if $s_0 a_1 \ldots a_n s_n$ is a trajectory achieving $G$, then there exists an answer set $M$
of $\Pi(D, I, G)$ such that:

(a) $\mathit{occ}(a_i, i-1) \in M$ for $i \in \{1, \ldots, n\}$, and

(b) $s_i = \{f \mid \mathit{holds}(f, i) \in M\}$ for $i \in \{0, \ldots, n\}$.

2. if $M$ is an answer set of $\Pi(D, I, G)$, then there exists an integer $0 \le k \le n$
such that $s_0 a_1 \ldots a_k s_k$ is a trajectory achieving $G$, where $\mathit{occ}(a_i, i-1) \in M$ for
$1 \le i \le k$ and $s_i = \{f \mid \mathit{holds}(f, i) \in M\}$ for $i \in \{0, \ldots, k\}$.
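The correspondence in item 2 of Theorem 1 can be sketched in Python as follows. This is an illustrative reading, not the authors' implementation: an answer set is assumed to be given as a set of tuples ("occ", action, t) and ("holds", fluent_literal, t), the function name is hypothetical, and it is assumed that exactly one action occurs at each of the steps $0, \ldots, k-1$.

```python
from typing import List, Set, Tuple

Atom = Tuple[str, str, int]   # ("occ", action, t) or ("holds", fluent_literal, t)


def trajectory_from_answer_set(atoms: Set[Atom]) -> List[object]:
    """Rebuild [s_0, a_1, s_1, ..., a_k, s_k] from the occ/holds atoms of an answer set."""
    # occ(a_i, i-1): the action occurring at each time moment.
    occ = {t: name for kind, name, t in atoms if kind == "occ"}
    # Assumption: actions occupy the consecutive steps 0..k-1.
    k = max(occ) + 1 if occ else 0
    trajectory: List[object] = []
    for i in range(k + 1):
        if i > 0:
            trajectory.append(occ[i - 1])
        # s_i = {f | holds(f, i) is in M}
        trajectory.append(
            frozenset(f for kind, f, t in atoms if kind == "holds" and t == i)
        )
    return trajectory
```

Applied to the atoms holds(at(home), 0), occ(walk(home,school), 0), and holds(at(school), 1), the function returns the trajectory consisting of the state {at(home)}, the action walk(home,school), and the state {at(school)}.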

In the rest of this work, if $M$ is an answer set of $\Pi(D, I, G)$, then we will denote
with $\alpha_M$ the trajectory achieving $G$ represented by $M$. Answer sets of the
program $\Pi(D, I, G)$ can be computed using answer set solvers such as smodels
(Simons et al. 2002), dlv (Leone et al. 2005), cmodels (Lierler and Maratea 2004),
ASSAT (Lin and Zhao 2002), and jsmodels (Le and Pontelli 2003).

3 A Language for Planning Preferences Specification 

In this section, we introduce the language $\mathcal{PP}$ for planning preferences specification.
This language allows users to express their preferences among plans that achieve the
same goal. We subdivide preferences into different classes: basic desires, atomic preferences,
and general preferences. Intuitively, a basic desire is a preference expressing
a desirable property of a plan, such as the use of a certain action over the others, the
satisfaction of a fluent formula, or a temporal property (Subsection 3.1). An atomic
preference describes a one-dimensional ordering on plans and allows us to describe a
ranking over the plans given a set of possibly conflicting preferences (Subsection 3.2).



⁴ A Prolog program for translating $(D, I, G)$ into $\Pi(D, I, G)$ can be found at
http://www.cs.nmsu.edu/~tson/ASPlan/Preferences/translate.pl

Finally, a general preference provides means for users to combine different preference
dimensions (Subsection 3.3).

Let $(D, I, G)$ be a planning problem with the set of actions $A$ and the set of fluents
$F$; let $\mathcal{F}_F$ be the set of all fluent formulae over $F$. The language $\mathcal{PP}$ is defined as
special formulae over $A$ and $F$. We will illustrate the different types of preferences
using the action theory representing the travel domain discussed earlier (Example 2).
User preferences about plans in this domain are often based on properties of actions.
Some of these properties are: flying is very fast but very expensive; walking is slow,
and very tiring if the distance between the two locations is large, but cheap; driving is
tiring and costs a little, but it is cheaper than flying and faster than walking; etc.

3.1 Basic Desires 

A basic desire is a formula expressing a single preference about a trajectory. Consider 
a user who is at home and wants to go to school (goal) spending as little money as 
possible (preference), i.e., his desire is to save money. He has only three alternatives:
walking, driving, or taking a taxi. Walking is the cheapest and riding a taxi is the most
expensive. Thus, a preferred trajectory for him should contain the action walk(., .).
This preference could also be expressed by a formula that forbids the fluents
available_taxi(home) and available_car from becoming true in any state of the trajectory, thus
preventing him from driving or taking a taxi to school. These two alternatives of preference
representation are not always equivalent. The first one represents the desire of leaving 
a state using a specific group of actions, while the second one represents the desire of 
being in certain states. 

Basic desires are constructed by using state desires and/or goal preferences. Intuitively,
a state desire describes a basic user preference to be considered in the context
of a specific state. A state desire $\varphi$ (where $\varphi$ is a fluent formula) implies that we prefer
a state $s$ such that $s \models \varphi$. A state desire occ($a$) implies that we prefer to leave state
$s$ using the action $a$. In many cases, it is also desirable to talk about the final state of
the trajectory; we call this a goal preference. These cases are formally defined next.

Definition 1 (State Desires and Goal Preferences)

A (primitive) state desire is either a formula $\varphi$, where $\varphi \in \mathcal{F}_F$, or a formula of the
form occ($a$), where $a \in A$.

A goal preference is a formula of the form goal($\varphi$), where $\varphi$ is a formula in $\mathcal{F}_F$.

We are now ready to define a basic desire that expresses a user preference over the
trajectory. As such, in addition to the propositional connectives $\wedge$, $\vee$, $\neg$, we will also
use the temporal connectives next, always, until, and eventually.

Definition 2 (Basic Desire Formula)

A basic desire formula is a formula satisfying one of the following conditions:

• a goal preference is a basic desire formula;

• a state desire $\varphi$ is a basic desire formula;

• given the basic desire formulae $\varphi_1, \varphi_2$, then $\varphi_1 \wedge \varphi_2$, $\varphi_1 \vee \varphi_2$, $\neg\varphi_1$, next($\varphi_1$),
until($\varphi_1$, $\varphi_2$), always($\varphi_1$), and eventually($\varphi_1$) are also basic desire formulae.

For example, to express the fact that a user would like to take the taxi or the bus to
go to school, we can write:

eventually( occ(bus(home, school)) $\vee$ occ(taxi(home, school)) ).

If the user's desire is not to call a taxi, we can write

always( $\neg$occ(call_taxi(home)) ).

If, for some reason, the user's desire is not to see any taxi around his home, the desire

always( $\neg$available_taxi(home) ).

can be used. Note that these encodings have different consequences: the second prevents
taxis from being present, regardless of whether they were called or not.

The definition above is used to develop formulae expressing a desire regarding the
structure of trajectories. In the next definition, we will describe when a trajectory
satisfies a basic desire formula. In a later section, we will present logic
programming rules that can be added to the program $\Pi(D, I, G)$ to compute trajectories
that satisfy a basic desire. In the following definitions, given a trajectory
$\alpha = s_0 a_1 s_1 \ldots a_n s_n$, the notation $\alpha[i]$ denotes the trajectory $s_i a_{i+1} s_{i+1} \ldots a_n s_n$.

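Definition 3 (Basic Desire Satisfaction)

Let $\alpha = s_0 a_1 s_1 a_2 s_2 \ldots a_n s_n$ be a trajectory, and let $\varphi$ be a basic desire formula. $\alpha$
satisfies $\varphi$ (written as $\alpha \models \varphi$) iff one of the following holds:

• $\varphi = \mathit{goal}(\psi)$ and $s_n \models \psi$;

• $\varphi = \psi \in \mathcal{F}_F$ and $s_0 \models \psi$;

• $\varphi = \mathit{occ}(a)$, $a_1 = a$, and $n \ge 1$;

• $\varphi = \psi_1 \wedge \psi_2$, $\alpha \models \psi_1$ and $\alpha \models \psi_2$;

• $\varphi = \psi_1 \vee \psi_2$, $\alpha \models \psi_1$ or $\alpha \models \psi_2$;

• $\varphi = \neg\psi$ and $\alpha \not\models \psi$;

• $\varphi = \mathit{next}(\psi)$, $\alpha[1] \models \psi$, and $n \ge 1$;

• $\varphi = \mathit{always}(\psi)$ and $\forall (0 \le i \le n)$ we have that $\alpha[i] \models \psi$;

• $\varphi = \mathit{eventually}(\psi)$ and $\exists (0 \le i \le n)$ such that $\alpha[i] \models \psi$;

• $\varphi = \mathit{until}(\psi_1, \psi_2)$ and $\exists (0 \le i \le n)$ such that $\alpha[j] \models \psi_1$ for all $0 \le j < i$ and
$\alpha[i] \models \psi_2$.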
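The recursive structure of Definition 3 can be mirrored by a small Python evaluator. The sketch below is illustrative only and rests on assumptions that are not part of the paper: formulas are encoded as nested tuples such as ("fluent", f), ("occ", a), ("goal", psi), ("and", p, q), and so on; a trajectory is a pair of lists (states $s_0, \ldots, s_n$ and actions $a_1, \ldots, a_n$); each state lists only the fluents that are true; and the argument of goal is assumed to be a fluent formula.

```python
from typing import List, Tuple

# A trajectory is (states, actions) with len(states) == len(actions) + 1.
Trajectory = Tuple[List[frozenset], List[str]]


def suffix(traj: Trajectory, i: int) -> Trajectory:
    """alpha[i]: the trajectory s_i a_{i+1} s_{i+1} ... a_n s_n."""
    states, actions = traj
    return states[i:], actions[i:]


def sat(traj: Trajectory, phi: tuple) -> bool:
    """Decide alpha |= phi following the cases of Definition 3."""
    states, actions = traj
    n = len(actions)
    op = phi[0]
    if op == "fluent":        # atomic state desire, evaluated in s_0
        return phi[1] in states[0]
    if op == "occ":           # the first action of the trajectory is a
        return n >= 1 and actions[0] == phi[1]
    if op == "goal":          # fluent formula evaluated in the final state s_n
        return sat(suffix(traj, n), phi[1])
    if op == "and":
        return sat(traj, phi[1]) and sat(traj, phi[2])
    if op == "or":
        return sat(traj, phi[1]) or sat(traj, phi[2])
    if op == "not":
        return not sat(traj, phi[1])
    if op == "next":
        return n >= 1 and sat(suffix(traj, 1), phi[1])
    if op == "always":
        return all(sat(suffix(traj, i), phi[1]) for i in range(n + 1))
    if op == "eventually":
        return any(sat(suffix(traj, i), phi[1]) for i in range(n + 1))
    if op == "until":
        return any(
            sat(suffix(traj, i), phi[2])
            and all(sat(suffix(traj, j), phi[1]) for j in range(i))
            for i in range(n + 1)
        )
    raise ValueError(f"unknown connective: {op!r}")
```

For the one-step trajectory in which the agent walks from home to school, this evaluator accepts eventually(occ(walk(home,school))) and always($\neg$occ(call_taxi(home))), and rejects goal(has_coffee).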

Definition 3 allows us to check whether a trajectory satisfies a basic desire. This will
also allow us to compare trajectories. Let us start with the simplest form of trajectory 
preference, involving a single desire. 

Definition 4 (Ordering Between Trajectories w.r.t. Single Desire)

Let $\varphi$ be a basic desire formula and let $\alpha$ and $\beta$ be two trajectories. The trajectory $\alpha$
is preferred to the trajectory $\beta$ (denoted as $\alpha \prec_\varphi \beta$) if $\alpha \models \varphi$ and $\beta \not\models \varphi$.

We say that $\alpha$ and $\beta$ are indistinguishable w.r.t. $\varphi$ (denoted as $\alpha \approx_\varphi \beta$) if one of
the two following cases occurs:

1. $\alpha \models \varphi$ and $\beta \models \varphi$, or

2. $\alpha \not\models \varphi$ and $\beta \not\models \varphi$.

Whenever it is clear from the context, we will omit $\varphi$ from $\prec_\varphi$ and $\approx_\varphi$. We will also
allow a weak form of single preference, described next.
Definition 5 (Weak Single Desire Preference)

Let $\varphi$ be a basic desire formula and let $\alpha$, $\beta$ be two trajectories. $\alpha$ is weakly preferred
to $\beta$ (denoted $\alpha \preceq_\varphi \beta$) iff $\alpha \prec_\varphi \beta$ or $\alpha \approx_\varphi \beta$.
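The three relations of Definitions 4 and 5 depend only on whether each trajectory satisfies the desire. A minimal Python sketch, assuming that satisfaction is supplied as a hypothetical predicate satisfies (for example, an evaluator for basic desire formulae) and using illustrative function names and result labels, is:

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def compare(alpha: T, beta: T, satisfies: Callable[[T], bool]) -> str:
    """Classify the pair (alpha, beta) w.r.t. a single desire."""
    a, b = satisfies(alpha), satisfies(beta)
    if a and not b:
        return "preferred"             # alpha is preferred to beta
    if a == b:
        return "indistinguishable"     # both satisfy the desire, or neither does
    return "not weakly preferred"      # beta is preferred to alpha


def weakly_preferred(alpha: T, beta: T, satisfies: Callable[[T], bool]) -> bool:
    """alpha is weakly preferred to beta: preferred or indistinguishable."""
    return compare(alpha, beta, satisfies) != "not weakly preferred"
```

With plans represented as lists of action names and the desire "the bus is used at some point", a plan containing bus is preferred to one that only walks, while two walking plans are indistinguishable.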

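It is easy to see that $\approx_\varphi$ is an equivalence relation over the set of trajectories.

Proposition 1

Given a basic desire $\varphi$, the relation $\approx_\varphi$ is an equivalence relation.

Proof

1. Reflexivity: this case is obvious.

2. Symmetry: let us assume that, given two trajectories $\alpha$, $\beta$, we have that $\alpha \approx_\varphi \beta$. This
implies that either both trajectories satisfy $\varphi$ or neither of them does. Obviously, if
$\alpha \models \varphi$ and $\beta \models \varphi$ ($\alpha \not\models \varphi$ and $\beta \not\models \varphi$), then we also have that $\beta \models \varphi$ and $\alpha \models \varphi$
($\beta \not\models \varphi$ and $\alpha \not\models \varphi$), which leads to $\beta \approx_\varphi \alpha$.

3. Transitivity: let us assume that for the trajectories $\alpha$, $\beta$, $\gamma$ we have that

$\alpha \approx_\varphi \beta$ and $\beta \approx_\varphi \gamma$.

From the first component, we have two possible cases:

(a) $\alpha \models \varphi$ and $\beta \models \varphi$. Since $\beta \approx_\varphi \gamma$, we need to have $\gamma \models \varphi$, which leads to $\alpha \approx_\varphi \gamma$.

(b) $\alpha \not\models \varphi$ and $\beta \not\models \varphi$. This second component, together with $\beta \approx_\varphi \gamma$, leads to
$\gamma \not\models \varphi$, and thus $\alpha \approx_\varphi \gamma$.

□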

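In the next proposition, we will show that $\preceq_\varphi$ is a partial order⁵ over the set of
equivalence class representatives of $\approx_\varphi$.

Proposition 2

The relation $\preceq_\varphi$ defines a partial order over the set of representatives of the equivalence
classes of $\approx_\varphi$.

Proof

Let us prove the three properties.

1. Reflexivity: consider a representative $\alpha$. Since either $\alpha \models \varphi$ or $\alpha \not\models \varphi$, we have that
$\alpha \preceq_\varphi \alpha$.

2. Anti-symmetry: consider two representatives $\alpha$, $\beta$ and let us assume that $\alpha \preceq_\varphi \beta$ and
$\beta \preceq_\varphi \alpha$. Since both $\alpha$ and $\beta$ are equivalence class representatives of $\approx_\varphi$, to prove this
property, it suffices to show that $\alpha \approx_\varphi \beta$. First of all, we can observe that from $\alpha \preceq_\varphi \beta$
we have either $\alpha \prec_\varphi \beta$ or $\alpha \approx_\varphi \beta$. If $\alpha \prec_\varphi \beta$ then this means that $\alpha \models \varphi$ and $\beta \not\models \varphi$.
But this would imply that $\beta \not\preceq_\varphi \alpha$. Then we must have that $\alpha \approx_\varphi \beta$.

3. Transitivity: consider three representatives $\alpha_1$, $\alpha_2$, $\alpha_3$ and let us assume

$\alpha_1 \preceq_\varphi \alpha_2 \;\wedge\; \alpha_2 \preceq_\varphi \alpha_3$.

Let us consider two cases.

• $\alpha_1 \prec_\varphi \alpha_2$. This implies that $\alpha_1 \models \varphi$ and $\alpha_2 \not\models \varphi$. Because $\alpha_2 \preceq_\varphi \alpha_3$, we have
that $\alpha_3 \not\models \varphi$. This, together with $\alpha_1 \models \varphi$, allows us to conclude $\alpha_1 \prec_\varphi \alpha_3$.

• $\alpha_1 \approx_\varphi \alpha_2$. Then either $\alpha_1 \models \varphi$ and $\alpha_2 \models \varphi$, or $\alpha_1 \not\models \varphi$ and $\alpha_2 \not\models \varphi$. In the first
case, we have $\alpha_1 \approx_\varphi \alpha_3$ if $\alpha_3 \models \varphi$ and $\alpha_1 \prec_\varphi \alpha_3$ if $\alpha_3 \not\models \varphi$, i.e., $\alpha_1 \preceq_\varphi \alpha_3$.
If instead we have the second possibility, then since $\alpha_2 \not\models \varphi$ and $\alpha_2 \preceq_\varphi \alpha_3$, we
must have $\alpha_3 \not\models \varphi$. This allows us to conclude that $\alpha_1 \approx_\varphi \alpha_3$ and thus $\alpha_1 \preceq_\varphi \alpha_3$.

⁵ This means that $\preceq_\varphi$ satisfies the following three properties: (i) Reflexivity: $\alpha \preceq_\varphi \alpha$;
(ii) Antisymmetry: if $\alpha \preceq_\varphi \beta$ and $\beta \preceq_\varphi \alpha$ then $\alpha \approx_\varphi \beta$; and (iii) Transitivity: if
$\alpha \preceq_\varphi \beta$ and $\beta \preceq_\varphi \gamma$ then $\alpha \preceq_\varphi \gamma$, where $\alpha$, $\beta$, and $\gamma$ are arbitrary trajectories.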



□

We next define the notion of most preferred trajectories.

Definition 6 (Most Preferred Trajectory w.r.t. Single Desire)

Let $\varphi$ be a basic desire formula. A trajectory $\alpha$ is said to be a most preferred trajectory
w.r.t. $\varphi$ if there is no trajectory $\beta$ such that $\beta \prec_\varphi \alpha$.

Note that in the presence of preferences, a most preferred trajectory might require
extra actions that would have been otherwise considered unnecessary.

Example 3

Let us enrich the action theory of Example 2 with an action called buy_coffee, which
allows one to have coffee, i.e., the fluent has_coffee becomes true. The coffee is not free,
i.e., the agent will have to pay some money if he buys coffee. This action can only be
executed at the nearby Starbucks shop. If our agent wants to be at school and prefers
to have coffee, we write:

goal(has_coffee).

Any plan satisfying this preference requires the agent to stop at the Starbucks shop
before going to school. E.g., while $s_0$ walk(home, school) $s_1$, where $s_0$ and $s_1$ denote
the initial state (the agent is at home) and the final state (the agent is at school),
respectively, is a valid trajectory for the agent to achieve his goal, this is not a most
preferred trajectory; instead, the agent has to go to the Starbucks shop, buy the coffee,
and then go to school. Besides the action buy_coffee that is needed for him to get
the coffee, the most preferred trajectory requires the action of going to the coffee shop,
which is not necessary if he does not have the preference of having the coffee.

Observe that the most preferred trajectory contains the action buy_coffee, which
can only be executed when the agent has some money. As such, if the agent does
not have any money, this action will not be executable and no trajectory achieving
the goal of being at the school will contain this action. This means that no plan can
satisfy the agent's preference, i.e., he will have to go to school without coffee. □

The above definitions are also expressive enough to describe a significant portion
of preferences that frequently occur in real-world domains. Since some of them are
particularly important, we will introduce some syntactic sugar to simplify their use:

• (Strong Desire) given the basic desire formulae $\varphi_1, \varphi_2$, $\varphi_1 < \varphi_2$ denotes $\varphi_1 \wedge \neg\varphi_2$.

• (Weak Desire) given the basic desire formulae $\varphi_1, \varphi_2$, $\varphi_1 <^w \varphi_2$ denotes $\varphi_1 \vee \neg\varphi_2$.

• (Enabled Desire) given two actions $a_1, a_2$, we will denote with $a_1 <^e a_2$ the
formula $(\mathit{executable}(a_1) \wedge \mathit{executable}(a_2)) \Rightarrow (\mathit{occ}(a_1) < \mathit{occ}(a_2))$, where

$$\mathit{executable}(a) = \bigvee_{a \ \mathbf{executable\_if}\ p_1, \ldots, p_k \,\in\, D} (p_1 \wedge \ldots \wedge p_k).$$
In the rest of the paper, we often use the following shorthands:

• For a sequence of preference formulae $\varphi_1, \ldots, \varphi_k$,

$\varphi_1 < \ldots < \varphi_k$

stands for

$\bigwedge_{i \in \{1, \ldots, k-1\}} (\varphi_i < \varphi_{i+1})$.

• For a sequence of preference formulae $\varphi_1, \ldots, \varphi_k$,

$\varphi_1 <^w \ldots <^w \varphi_k$

stands for

$\bigwedge_{i \in \{1, \ldots, k-1\}} (\varphi_i <^w \varphi_{i+1})$.

• For the sequences of actions $a_1, \ldots, a_k$ and $b_1, \ldots, b_m$,

$(a_1 \vee \ldots \vee a_k) <^e (b_1 \vee \ldots \vee b_m)$

is a shorthand for

$\bigwedge_{i \in \{1, \ldots, k\},\ j \in \{1, \ldots, m\}} (a_i <^e b_j)$.

• For actions with parameters like drive or walk, we sometimes write drive $<^e$ walk
to denote the preference

$\bigvee_{l_1, l_2 \in S}$ (drive($l_1$, $l_2$) $<^e$ walk($l_1$, $l_2$)),

where $S$ is a set of pre-defined locations. Intuitively, this preference states that
we prefer to drive rather than to walk between locations belonging to the set $S$.
For example, if $S$ = {home, school} then this preference says that we prefer to
drive from home to school and vice versa.
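The syntactic sugar above amounts to a translation into plain basic desire formulae, which can be sketched in Python. The tuple encoding of formulae, the constants ("true",) and ("false",), the function names, and the dictionary conditions (mapping each action to the list of precondition lists of its executability conditions in $D$) are all assumptions made for this illustration.

```python
def conj(formulas):
    """Fold formulas into nested binary conjunctions."""
    formulas = list(formulas)
    if not formulas:
        return ("true",)      # hypothetical constant for the empty conjunction
    result = formulas[0]
    for f in formulas[1:]:
        result = ("and", result, f)
    return result


def disj(formulas):
    """Fold formulas into nested binary disjunctions."""
    formulas = list(formulas)
    if not formulas:
        return ("false",)     # hypothetical constant for the empty disjunction
    result = formulas[0]
    for f in formulas[1:]:
        result = ("or", result, f)
    return result


def strong(p, q):
    """Strong desire p < q, i.e. p and not q."""
    return ("and", p, ("not", q))


def weak(p, q):
    """Weak desire p <^w q, i.e. p or not q."""
    return ("or", p, ("not", q))


def executable(action, conditions):
    """Disjunction, over the executability conditions of the action, of their preconditions."""
    return disj(conj(pre) for pre in conditions[action])


def enabled(a1, a2, conditions):
    """Enabled desire a1 <^e a2, with the implication written as (not A) or B."""
    both = ("and", executable(a1, conditions), executable(a2, conditions))
    return ("or", ("not", both), strong(("occ", a1), ("occ", a2)))


def chain(op, formulas):
    """p_1 op ... op p_k as the conjunction of (p_i op p_{i+1})."""
    return conj(op(p, q) for p, q in zip(formulas, formulas[1:]))
```

For example, chain(weak, [p, q, r]) produces the conjunction of weak(p, q) and weak(q, r), matching the second shorthand.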
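We can prove the following simple property of $<^w$.

Lemma 1

Consider the set of basic desire formulae and let us interpret $<^w$ as a relation. This
relation is transitive.

Proof

Let $\varphi_1 <^w \varphi_2$ and $\varphi_2 <^w \varphi_3$. But these are the same as

$(\varphi_1 \vee \neg\varphi_2) \wedge (\varphi_2 \vee \neg\varphi_3)$

which implies

$(\varphi_1 \vee \neg\varphi_3)$

and thus $\varphi_1 <^w \varphi_3$. □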
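The propositional step in the proof of Lemma 1 can be checked exhaustively. The following Python sketch treats the three desires as Boolean values (their satisfaction by a fixed trajectory) and verifies the implication over all eight assignments; the function names are illustrative.

```python
from itertools import product


def weak(p: bool, q: bool) -> bool:
    """Truth value of the weak desire p <^w q, i.e. p or not q."""
    return p or not q


def weak_desire_is_transitive() -> bool:
    """Check that (p <^w q) and (q <^w r) imply (p <^w r) for every assignment."""
    return all(
        weak(p, r)
        for p, q, r in product([False, True], repeat=3)
        if weak(p, q) and weak(q, r)
    )
```

The check succeeds because the only way to falsify $\varphi_1 \vee \neg\varphi_3$ is to have $\varphi_1$ false and $\varphi_3$ true, in which case one of the two premises fails for either value of $\varphi_2$.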

3.2 Atomic Preferences 



Basic desires allow the users to specify their preferences and can be used in selecting
trajectories which satisfy them. From the definition of a basic desire formula, we can
assume that users always have a set of desire formulae and that their intention is to
find a trajectory that satisfies all such formulae. In many cases, this proves to be too
strong and results in situations where no preferred trajectory can be found. Consider
again the preference in Example 2: it is obvious that the user cannot have a plan that
costs him nothing and yet satisfies his preferences. In the travel domain, time and cost
are two criteria that a user might have when making a travel plan. These two criteria
are often in conflict, i.e., a transportation method that takes little time often costs
more. As such, it is very unlikely that the user can get a plan that costs very little
and takes very little time.

Example 4 

Let us continue with our travel domain. Again, let us assume that the agent is at
home and he wants to go to school. To simplify the representation, we will write bus,
taxi, drive, and walk to say that the agent takes the bus, takes a taxi, drives, or walks to
school, respectively. The following is a desire expressing that the agent prefers to take
the fastest possible way to go to school (assume that both driving and taking the bus
require about the same amount of time to reach the school):

time = always(taxi $<^e$ (drive $\vee$ bus) $<^e$ walk)

On the other hand, when the agent is not in a hurry, he/she prefers to take the cheaper
way to go to school (assume that driving and taking the bus cost about the same
amount of money):

cost = always(walk $<^e$ (drive $\vee$ bus) $<^e$ taxi)

Here, the preference states that the agent prefers to execute first the action that
consumes the least amount of money. □

It is easy to see that any trajectory satisfying the preference time will not satisfy the
preference cost and vice versa. This discussion shows that it is necessary to provide
users with a simple way to rank their basic desires. To address the problem, we allow a 
new type of formulae, called atomic preferences, which represents an ordering between 
basic desire formulae. 

Definition 7 (Atomic Preference)

An atomic preference formula is defined as a formula of the type $\varphi_1 \lhd \varphi_2 \lhd \cdots \lhd \varphi_n$
where $\varphi_1, \ldots, \varphi_n$ are basic desire formulae.

The intuition behind an atomic preference is to provide an ordering between different
desires, i.e., it indicates that trajectories that satisfy $\varphi_1$ are preferable to those that
satisfy $\varphi_2$, etc. Clearly, basic desire formulae are special cases of atomic preferences,
where all preference formulae have the same rank. The definitions of $\approx$ and $\prec$ can
now be extended to compare trajectories w.r.t. atomic preferences.

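Definition 8 (Ordering Between Trajectories w.r.t. Atomic Preferences) 

Let $\alpha, \beta$ be two trajectories, and let $\Psi = \varphi_1 \lhd \varphi_2 \lhd \cdots \lhd \varphi_n$ be an atomic preference 
formula. 

• $\alpha, \beta$ are indistinguishable w.r.t. $\Psi$ (written as $\alpha \approx_\Psi \beta$) if 

$\forall i.\ [\,1 \le i \le n \Rightarrow \alpha \approx_{\varphi_i} \beta\,]$. 

• $\alpha$ is preferred to $\beta$ w.r.t. $\Psi$ (written as $\alpha \prec_\Psi \beta$) if $\exists (1 \le i \le n)$ such that 

1. $\forall (1 \le j < i)$ we have that $\alpha \approx_{\varphi_j} \beta$, and 
2. $\alpha \prec_{\varphi_i} \beta$. 

We will say that $\alpha \preceq_\Psi \beta$ if either $\alpha \prec_\Psi \beta$ or $\alpha \approx_\Psi \beta$. 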
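
Definition 8 amounts to a lexicographic comparison: the first basic desire that tells the two trajectories apart decides. The sketch below illustrates this under simplifying assumptions that are not part of the formal development: a trajectory is reduced to its list of action names, a basic desire is modelled as a Boolean test on such a list, and `cheap` and `fast` are hypothetical stand-ins for the desires of the travel domain.

```python
from typing import Callable, Sequence

Trajectory = Sequence[str]              # assumption: a trajectory reduced to its action names
Desire = Callable[[Trajectory], bool]   # assumption: a basic desire as a satisfaction test


def compare_basic(phi: Desire, a: Trajectory, b: Trajectory) -> str:
    """'<' if a is preferred to b w.r.t. phi, '>' if b is preferred, '~' if indistinguishable."""
    sat_a, sat_b = phi(a), phi(b)
    if sat_a == sat_b:
        return "~"
    return "<" if sat_a else ">"


def compare_atomic(phis: Sequence[Desire], a: Trajectory, b: Trajectory) -> str:
    """Compare a and b w.r.t. phi_1 <| ... <| phi_n (the first distinguishing desire decides)."""
    for phi in phis:
        verdict = compare_basic(phi, a, b)
        if verdict != "~":
            return verdict
    return "~"


# Hypothetical stand-ins for basic desires of the travel domain.
def cheap(traj: Trajectory) -> bool:
    return "taxi" not in traj


def fast(traj: Trajectory) -> bool:
    return "walk" not in traj


alpha = ["walk"]
beta = ["call_taxi", "taxi"]
cheap_first = compare_atomic([cheap, fast], alpha, beta)   # alpha is preferred
fast_first = compare_atomic([fast, cheap], alpha, beta)    # beta is preferred
```

Swapping the order of the two desires flips the verdict, which is the effect an atomic preference is meant to capture.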

It is easy to see that $\approx_\Psi$ is an equivalence relation on the set of trajectories. The 
following proposition is similar to Proposition 2. 

Proposition 3 

For an atomic preference $\Psi$, $\preceq_\Psi$ is a partial order over the set of representatives of 
the equivalence classes of $\approx_\Psi$. 

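Proof 

Let us analyze the three properties. 

• Reflexivity: Consider a representative $\alpha$. By Definition 8, $\alpha \approx_\Psi \alpha$, which leads to 
$\alpha \preceq_\Psi \alpha$. 

• Anti-symmetry: Let $\alpha \preceq_\Psi \beta$ and $\beta \preceq_\Psi \alpha$. Again, it is enough if we can show that 
$\alpha \approx_\Psi \beta$. Let us assume, by contradiction, that $\alpha \prec_\Psi \beta$. This means that there is 
a value of $i$ such that, for all $1 \le j < i$, we have that $\alpha \approx_{\varphi_j} \beta$ and $\alpha \prec_{\varphi_i} \beta$. But 
this implies that $\beta \approx_{\varphi_j} \alpha$ for $j < i$ and $\beta \not\preceq_{\varphi_i} \alpha$, which ultimately means $\beta \not\preceq_\Psi \alpha$, 
contradicting the initial assumptions. 

• Transitivity: let $\alpha_1, \alpha_2, \alpha_3$ be three representatives such that 

$\alpha_1 \preceq_\Psi \alpha_2 \land \alpha_2 \preceq_\Psi \alpha_3$ 

Let us consider the possible cases arising from the first component: 

- $\alpha_1 \approx_\Psi \alpha_2$. This means that $\alpha_1 \approx_{\varphi_j} \alpha_2$ for all $1 \le j \le n$. We have two sub-cases: 

  - $\alpha_2 \approx_\Psi \alpha_3$. Because $\approx_\Psi$ is an equivalence relation, we have that $\alpha_1 \approx_\Psi \alpha_3$, 
  which implies that $\alpha_1 \preceq_\Psi \alpha_3$. 

  - $\alpha_2 \prec_\Psi \alpha_3$. This means that there exists $i$, $1 \le i \le n$, such that $\alpha_2 \approx_{\varphi_k} \alpha_3$ for 
  all $1 \le k < i$ and $\alpha_2 \prec_{\varphi_i} \alpha_3$. Since $\approx_{\varphi_j}$ is an equivalence relation, we have 
  that $\alpha_1 \approx_{\varphi_j} \alpha_3$ for all $1 \le j < i$. Furthermore, $\alpha_1 \approx_{\varphi_i} \alpha_2$ and $\alpha_2 \prec_{\varphi_i} \alpha_3$ 
  imply that $\alpha_1 \models \varphi_i$, $\alpha_2 \models \varphi_i$, and $\alpha_3 \not\models \varphi_i$. Thus, $\alpha_1 \prec_{\varphi_i} \alpha_3$. Hence, 
  $\alpha_1 \prec_\Psi \alpha_3$. 

- $\alpha_1 \prec_\Psi \alpha_2$. This implies that there exists $i$, $1 \le i \le n$, such that $\alpha_1 \approx_{\varphi_k} \alpha_2$ for 
all $1 \le k < i$ and $\alpha_1 \prec_{\varphi_i} \alpha_2$. Again, we have two sub-cases: 

  - $\alpha_2 \approx_\Psi \alpha_3$. This means that $\alpha_2 \approx_{\varphi_j} \alpha_3$ for all $j$, $1 \le j \le n$. So, we have that 
  $\alpha_1 \approx_{\varphi_j} \alpha_3$ for all $1 \le j < i$, since $\approx_{\varphi_j}$ is an equivalence relation. Similar 
  to the above case, we can show that $\alpha_1 \prec_{\varphi_i} \alpha_2$ and $\alpha_2 \approx_{\varphi_i} \alpha_3$ imply that 
  $\alpha_1 \prec_{\varphi_i} \alpha_3$. Thus, $\alpha_1 \prec_\Psi \alpha_3$. 

  - $\alpha_2 \prec_\Psi \alpha_3$. This means that there exists $j$, $1 \le j \le n$, such that $\alpha_2 \approx_{\varphi_k} \alpha_3$ 
  for all $1 \le k < j$ and $\alpha_2 \prec_{\varphi_j} \alpha_3$. If $j \ge i$, we have that $\alpha_1 \approx_{\varphi_k} \alpha_3$ for $k < i$ 
  and $\alpha_1 \prec_{\varphi_i} \alpha_3$ (because $\alpha_1 \prec_{\varphi_i} \alpha_2$ and $\alpha_2 \preceq_{\varphi_i} \alpha_3$). Otherwise, if $j < i$, 
  using this fact and the transitivity of $\approx_{\varphi_k}$, we can conclude that $\alpha_1 \approx_{\varphi_k} \alpha_3$ 
  for all $1 \le k < j$ and $\alpha_1 \prec_{\varphi_j} \alpha_3$, which implies that $\alpha_1 \prec_\Psi \alpha_3$. 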

□ 

Definition 9 (Most Preferred Trajectory w.r.t. Atomic Preferences) 

A trajectory $\alpha$ is most preferred if there is no other trajectory that is preferred to $\alpha$. 

Example 5 

Let us continue with Example 4. The two preferences time and cost can be combined 
into different atomic preferences, e.g., 

$time \lhd cost$ or $cost \lhd time$. 

The first one is more appropriate for the agent when he is in a hurry, while the second 
one is more appropriate for him when he has time. The trajectory 

$\alpha = s_0\ walk(home, school)\ s_1$ 

is preferred to the trajectory 

$\beta = s_0\ call\_taxi(home)\ s'_1\ taxi(home, school)\ s'_2$ 

w.r.t. the preference $cost \lhd time$, i.e., $\alpha \prec_{cost \lhd time} \beta$. On the other hand, we have 
that $\beta \prec_{time \lhd cost} \alpha$. □ 

3.3 General Preferences 

Atomic preferences allow users to list their preferences according to their importance, 
where more preferred desires appear before less preferred ones. Naturally, when a user 
has a set of atomic preferences, there is a need for combining them to create a new 
preference that can be used to select the best possible trajectory suitable to him/her. 
This can be seen in the next example. 

Example 6 

Let us continue with the action theory described in Example ^ Besides time and 
cost, agents often have their preferences based on the level of comfort and/or safety 
of the available transportation methods. These preferences can be represented by 
the formulae 

$comfort = \mathbf{always}(flight <^e (drive \lor bus) <^e walk)$ 

and 

$safety = \mathbf{always}(walk <^e flight <^e (drive \lor bus))$. 

Now, consider an agent who has in mind the four basic desires time, cost, comfort, 
and safety. He can rank these preferences and create different atomic preferences, i.e., 
different orders among these preferences. Let us assume that he has combined these 
four desires and produced the following two atomic preferences 

$\Psi_1 = comfort \lhd safety$ and $\Psi_2 = cost \lhd time$. 

Intuitively, $\Psi_1$ is a comparison between level of comfort and safety, while $\Psi_2$ is a 
comparison between affordability and duration. 

Suppose that the agent would like to combine $\Psi_1$ and $\Psi_2$ to create a preference 
stating that he prefers trajectories that are as comfortable as possible and cost as little 
as possible. So far, the only possible way for him to combine these two preferences is to 
concatenate them in a certain order and view the result as a new atomic preference. 
However, neither $\Psi_1 \lhd \Psi_2$ nor $\Psi_2 \lhd \Psi_1$ meets the desired criteria, as they simply 
state that $\Psi_1$ is more relevant than $\Psi_2$ (or vice versa). The only way to accomplish 
the desired effect is to decompose $\Psi_1$ and $\Psi_2$ and rebuild a more complex atomic 
preference. This shows that the agent might have to define a new atomic preference 
for his newly introduced preference. □ 

Footnote (Example 5): For brevity, we omit the description of the states $s_i$. 

The above discussion shows that it is necessary to provide additional ways for com- 
bining atomic preferences. This is the topic of this sub-section. We will introduce 
general preferences, which can be used to describe a multi-dimensional order among 
preferences. Formally, we define general preferences as follows. 

Definition 10 (General Preferences) 

A general preference formula is a formula satisfying one of the following conditions: 

• An atomic preference $\Psi$ is a general preference; 

• If $\Psi_1, \Psi_2$ are general preferences, then $\Psi_1 \& \Psi_2$, $\Psi_1 \mid \Psi_2$, and $!\Psi_1$ are general 
preferences; 

• If $\Psi_1, \Psi_2, \ldots, \Psi_k$ is a collection of general preferences, then $\Psi_1 \lhd \Psi_2 \lhd \cdots \lhd \Psi_k$ is a 
general preference. 

In the above definition, the operators &, |, ! are used to express different ways to 
combine preferences. Syntactically, they are similar to the operations $\land$, $\lor$, $\neg$ employed in 
the construction of basic desire formulae. Semantically, they differ from the 
operations $\land$, $\lor$, $\neg$ in a subtle way. Intuitively, the constituents of a general preference are 
atomic preferences; and a general preference provides a means for combining different 
orderings among trajectories created by its constituents, thus providing a means for 
the selection of a most preferred trajectory. Let us consider the case where a general 
preference has two constituents $\Psi_1$ and $\Psi_2$. As we will see later, each preference will 
induce two relations on trajectories, the indistinguishable and preferred relations, as 
in the case of atomic preferences. In other words, $\Psi_i$ can be represented by this pair 
of relations. Given a general preference $\Psi_i$, let $o_i$ and $e_i$ denote the set of pairs of 
trajectories $(\alpha, \beta)$ such that $\alpha \prec_{\Psi_i} \beta$ and $\alpha \approx_{\Psi_i} \beta$, respectively. The operators &, |, ! 
look at this characterization and define two relations among trajectories that satisfy 
the following equations: 

• For the formula $\Psi_1 \& \Psi_2$, we define 

$(o_1, e_1) \,\&\, (o_2, e_2) \stackrel{def}{=} ((o_1 \cap o_2), (e_1 \cap e_2))$. 

This says that the relation representing the ordering between trajectories 
induced by & is created as the intersection of the relations representing the 
orderings between trajectories induced by its components. In this case, a most 
preferred trajectory is the one which is most preferred w.r.t. every component 
of the formula. 

• For the formula $\Psi_1 \mid \Psi_2$, we define 

$(o_1, e_1) \mid (o_2, e_2) \stackrel{def}{=} ((o_1 \cap o_2) \cup (o_1 \cap e_2) \cup (e_1 \cap o_2), (e_1 \cap e_2))$. 

Here, the relation induced by | guarantees that the most preferred trajectory is 
the one which is most preferred w.r.t. at least one component of the formula 
and it is indistinguishable from others w.r.t. the remaining component of the 
formula. 

• For the formula $!\Psi_1$, we define 

!(oi,ei) =^((oi"^ Uei),ei) 

which basically reverses the relations induced by $\Psi_1$. 

This is made more precise in the following definition. 

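Definition 11 (Ordering Between Trajectories w.r.t. General Preferences) 
Let $\Psi$ be a general preference and let $\alpha, \beta$ be two trajectories. We say that 

• $\alpha$ is preferred to $\beta$, denoted by $\alpha \prec_\Psi \beta$, if: 

1. $\Psi$ is an atomic preference and $\alpha \prec_\Psi \beta$ 

2. $\Psi = \Psi_1 \& \Psi_2$ and $\alpha \prec_{\Psi_1} \beta$ and $\alpha \prec_{\Psi_2} \beta$ 

3. $\Psi = \Psi_1 \mid \Psi_2$ and: 

(a) $\alpha \prec_{\Psi_1} \beta$ and $\alpha \approx_{\Psi_2} \beta$; or 

(b) $\alpha \approx_{\Psi_1} \beta$ and $\alpha \prec_{\Psi_2} \beta$; or 

(c) $\alpha \prec_{\Psi_1} \beta$ and $\alpha \prec_{\Psi_2} \beta$ 

4. $\Psi = !\Psi_1$ and $\beta \prec_{\Psi_1} \alpha$ 

5. $\Psi = \Psi_1 \lhd \cdots \lhd \Psi_k$ and there exists $1 \le i \le k$ such that: (i) $\forall (1 \le j < i)$ 
we have that $\alpha \approx_{\Psi_j} \beta$, and (ii) $\alpha \prec_{\Psi_i} \beta$. 

• $\alpha$ is indistinguishable from $\beta$, denoted by $\alpha \approx_\Psi \beta$, if: 

1. $\Psi$ is an atomic preference and $\alpha \approx_\Psi \beta$. 

2. $\Psi = \Psi_1 \& \Psi_2$, $\alpha \approx_{\Psi_1} \beta$, and $\alpha \approx_{\Psi_2} \beta$. 

3. $\Psi = \Psi_1 \mid \Psi_2$, $\alpha \approx_{\Psi_1} \beta$, and $\alpha \approx_{\Psi_2} \beta$. 

4. $\Psi = !\Psi_1$ and $\alpha \approx_{\Psi_1} \beta$. 

5. $\Psi = \Psi_1 \lhd \cdots \lhd \Psi_k$ and for all $1 \le i \le k$ we have that $\alpha \approx_{\Psi_i} \beta$. 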
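
The recursive structure of Definition 11 can be sketched as two mutually dependent functions. This is only an illustration under stated assumptions: general preferences are encoded as nested tuples (a hypothetical representation), basic desires are Boolean tests on a list of action names, and `only` is a made-up stand-in for the time and cost desires of the travel domain.

```python
def indist(p, a, b):
    """True if a and b are indistinguishable w.r.t. the general preference p."""
    tag = p[0]
    if tag == "atomic":                       # p = ("atomic", [phi_1, ..., phi_n])
        return all(phi(a) == phi(b) for phi in p[1])
    if tag in ("and", "or"):                  # Psi1 & Psi2 and Psi1 | Psi2
        return indist(p[1], a, b) and indist(p[2], a, b)
    if tag == "not":                          # !Psi1
        return indist(p[1], a, b)
    if tag == "chain":                        # Psi1 <| ... <| Psik
        return all(indist(q, a, b) for q in p[1])
    raise ValueError(f"unknown preference: {tag}")


def preferred(p, a, b):
    """True if a is preferred to b w.r.t. the general preference p."""
    tag = p[0]
    if tag == "atomic":
        for phi in p[1]:
            if phi(a) != phi(b):              # first desire that tells a and b apart
                return phi(a)
        return False
    if tag == "and":
        return preferred(p[1], a, b) and preferred(p[2], a, b)
    if tag == "or":
        p1, p2 = preferred(p[1], a, b), preferred(p[2], a, b)
        e1, e2 = indist(p[1], a, b), indist(p[2], a, b)
        return (p1 and e2) or (e1 and p2) or (p1 and p2)
    if tag == "not":
        return preferred(p[1], b, a)
    if tag == "chain":
        for q in p[1]:
            if preferred(q, a, b):
                return True
            if not indist(q, a, b):           # an earlier component already separates them
                return False
        return False
    raise ValueError(f"unknown preference: {tag}")


MOVES = {"walk", "take_taxi", "drive", "bus"}


def only(mode):
    """Hypothetical basic desire: every movement action in the trajectory is `mode`."""
    return lambda traj: all(act == mode for act in traj if act in MOVES)


time_pref = ("atomic", [only("take_taxi")])
cost_pref = ("atomic", [only("walk")])
alpha = ["walk", "buy_coffee", "walk"]
beta = ["walk", "buy_coffee", "take_taxi"]

either = preferred(("or", time_pref, cost_pref), alpha, beta)
both = preferred(("and", time_pref, cost_pref), alpha, beta)
```

With these stand-ins, `alpha` wins under `|` because it is better on cost and tied on time, while `&` leaves the two trajectories unordered.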

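As above, $\alpha \preceq_\Psi \beta$ indicates that either $\alpha \prec_\Psi \beta$ or $\alpha \approx_\Psi \beta$. Before we move 
on, let us observe that the formula $\varphi_1 \lhd \cdots \lhd \varphi_k$, where each $\varphi_i$ is a basic desire, can 
be viewed both as an atomic preference and as a general preference. This is not 
ambiguous since the semantic definitions for the two cases coincide. It is easy to see 
that the following holds for $\preceq_\Psi$. 

Lemma 2 

For every pair of trajectories $\alpha$ and $\beta$ and a general preference $\Psi$ such that $\alpha \preceq_\Psi \beta$, 

- if $\Psi = \Psi_1 \& \Psi_2$ then $\alpha \preceq_{\Psi_1} \beta$ and $\alpha \preceq_{\Psi_2} \beta$ 

- if $\Psi = \Psi_1 \mid \Psi_2$ then $\alpha \preceq_{\Psi_1} \beta$ and $\alpha \preceq_{\Psi_2} \beta$ 

- if $\Psi = !\Psi_1$ then $\beta \preceq_{\Psi_1} \alpha$. 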

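We can also show that for every general preference $\Psi$, $\approx_\Psi$ is an equivalence 
relation over trajectories and $\preceq_\Psi$ is a partial order on the set of representatives of the 
equivalence classes of $\approx_\Psi$. To prove this property, we need the following lemma. 

Lemma 3 

Let $\Psi$ be a general preference formula and let $\alpha$, $\beta$, and $\gamma$ be three trajectories. It 
holds that 

• if $\alpha \prec_\Psi \beta$ and $\beta \approx_\Psi \gamma$ then $\alpha \prec_\Psi \gamma$; and 

• if $\alpha \prec_\Psi \beta$ and $\alpha \approx_\Psi \gamma$ then $\gamma \prec_\Psi \beta$. 

Proof 

Let us prove the result by structural induction on $\Psi$. Because the proofs of the two 
items are almost the same, we present below the proof of the first item. 

• Base: If $\Psi$ is an atomic preference, and $\Psi = \varphi_1 \lhd \cdots \lhd \varphi_k$, then from $\alpha \prec_\Psi \beta$ (Definition 
8) we obtain that there exists $1 \le i \le k$ such that 

- $\alpha \approx_{\varphi_j} \beta$ for all $1 \le j < i$ and 

- $\alpha \prec_{\varphi_i} \beta$. 

Furthermore, from $\beta \approx_\Psi \gamma$ we have that $\beta \approx_{\varphi_j} \gamma$ for all $1 \le j \le k$. Since the $\approx_{\varphi_j}$ are 
all equivalence relations, we obtain that $\alpha \approx_{\varphi_j} \gamma$ for all $1 \le j < i$; furthermore, since 
$\alpha \prec_{\varphi_i} \beta$, then $\alpha \models \varphi_i$ and $\beta \not\models \varphi_i$. Since $\beta \approx_{\varphi_i} \gamma$, then necessarily we have also that 
$\gamma \not\models \varphi_i$. This allows us to conclude that $\alpha \prec_\Psi \gamma$. 

• Inductive Step: 

1. $\Psi = \Psi_1 \& \Psi_2$. Since $\alpha \prec_\Psi \beta$, we have $\alpha \prec_{\Psi_1} \beta$ and $\alpha \prec_{\Psi_2} \beta$. Furthermore, 
$\beta \approx_\Psi \gamma$ implies $\beta \approx_{\Psi_1} \gamma$ and $\beta \approx_{\Psi_2} \gamma$. From the induction hypothesis we have 
$\alpha \prec_{\Psi_1} \gamma$ and $\alpha \prec_{\Psi_2} \gamma$, which leads to $\alpha \prec_\Psi \gamma$. 

2. $\Psi = \Psi_1 \mid \Psi_2$. From $\alpha \prec_\Psi \beta$ we have three possible cases: 

(a) $\alpha \prec_{\Psi_1} \beta$ and $\alpha \prec_{\Psi_2} \beta$. In this case, since $\beta \approx_\Psi \gamma$ implies $\beta \approx_{\Psi_1} \gamma$ and 
$\beta \approx_{\Psi_2} \gamma$, we have that $\alpha \prec_{\Psi_1} \gamma$ and $\alpha \prec_{\Psi_2} \gamma$. This implies that $\alpha \prec_\Psi \gamma$. 

(b) $\alpha \prec_{\Psi_1} \beta$ and $\alpha \approx_{\Psi_2} \beta$. From $\beta \approx_\Psi \gamma$ we obtain $\beta \approx_{\Psi_1} \gamma$ and $\beta \approx_{\Psi_2} \gamma$. Since 
$\approx_{\Psi_2}$ is an equivalence relation, we can infer $\alpha \approx_{\Psi_2} \gamma$. Furthermore, from 
the induction hypothesis we obtain $\alpha \prec_{\Psi_1} \gamma$. These two conclusions lead to 
$\alpha \prec_\Psi \gamma$. 

(c) $\alpha \prec_{\Psi_2} \beta$ and $\alpha \approx_{\Psi_1} \beta$. This case is symmetrical to the previous one. 

3. $\Psi = !\Psi_1$. From $\alpha \prec_\Psi \beta$ we obtain $\beta \prec_{\Psi_1} \alpha$. Since $\beta \approx_\Psi \gamma$ implies $\beta \approx_{\Psi_1} \gamma$, from 
the induction hypothesis we can conclude $\gamma \prec_{\Psi_1} \alpha$ and ultimately $\alpha \prec_\Psi \gamma$. 

4. $\Psi = \Psi_1 \lhd \cdots \lhd \Psi_k$. From the definition of $\alpha \prec_\Psi \beta$ we have that there exists 
$1 \le i \le k$ such that 

(a) $\alpha \approx_{\Psi_j} \beta$ for all $1 \le j < i$ and 

(b) $\alpha \prec_{\Psi_i} \beta$. 

Furthermore, from $\beta \approx_\Psi \gamma$ we know that $\beta \approx_{\Psi_j} \gamma$ for all $1 \le j \le k$. Since the $\approx_{\Psi_j}$ 
are all equivalence relations, we have that $\alpha \approx_{\Psi_j} \gamma$ for all $1 \le j < i$. 

Furthermore, from the induction hypothesis, from $\alpha \prec_{\Psi_i} \beta$ and $\beta \approx_{\Psi_i} \gamma$ we can 
conclude $\alpha \prec_{\Psi_i} \gamma$. This finally leads to $\alpha \prec_\Psi \gamma$. 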



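□ 

Lemma 4 

Let $\Psi$ be a general preference formula and $\alpha$, $\beta$, and $\gamma$ be three trajectories. It holds 
that if $\alpha \prec_\Psi \beta$ and $\beta \prec_\Psi \gamma$ then $\alpha \prec_\Psi \gamma$. 

Proof 

Let us prove the result by structural induction on $\Psi$. 

• Base: If $\Psi$ is an atomic preference, and $\Psi = \varphi_1 \lhd \cdots \lhd \varphi_k$, then from $\alpha \prec_\Psi \beta$ (Definition 
8) we obtain that there exists $1 \le i \le k$ such that 

- $\alpha \approx_{\varphi_j} \beta$ for all $1 \le j < i$ and 

- $\alpha \prec_{\varphi_i} \beta$. 

Furthermore, from $\beta \prec_\Psi \gamma$, we know that there exists $1 \le l \le k$ such that 

- $\beta \approx_{\varphi_j} \gamma$ for all $1 \le j < l$ and 

- $\beta \prec_{\varphi_l} \gamma$. 

If $i \le l$, it is easy to see that $\alpha \approx_{\varphi_j} \gamma$ for $j < i$ and $\alpha \prec_{\varphi_i} \gamma$, which implies that 
$\alpha \prec_\Psi \gamma$. If $l < i$ holds, we have that $\alpha \approx_{\varphi_j} \gamma$ for all $1 \le j < l$ and $\alpha \prec_{\varphi_l} \gamma$. Thus, 
$\alpha \prec_\Psi \gamma$. 

• Inductive Step: 

- $\Psi = \Psi_1 \& \Psi_2$. Since $\alpha \prec_\Psi \beta$, we have $\alpha \prec_{\Psi_1} \beta$ and $\alpha \prec_{\Psi_2} \beta$. Furthermore, 
$\beta \prec_\Psi \gamma$ implies $\beta \prec_{\Psi_1} \gamma$ and $\beta \prec_{\Psi_2} \gamma$. From the induction hypothesis we have 
$\alpha \prec_{\Psi_1} \gamma$ and $\alpha \prec_{\Psi_2} \gamma$, which leads to $\alpha \prec_\Psi \gamma$. 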

- $\Psi = \Psi_1 \mid \Psi_2$. From $\alpha \prec_\Psi \beta$ we have three possible cases: 

1. a (3 and a -<^!^ (3. The proof for this case is similar to the case 

In this case, since (3 7 implies [3 ^^j^ 7 and f3 7, then we have 
a -<>i'i 7 and a -<<i,^ 7. This implies that a 7. 

2. a -<>pj^ /? and a fs^^ From /3 7 we obtain /? 7 and /3 «>i.2 7. Since 
«ii(2 is an equivalence relation, we can infer a Wipj 7. Furthermore, from 
the induction hypothesis we obtain a -<>irj 7. These two conclusions lead to 
a -<>p 7. 

3. a -<ip2 /? and a Wi^j^ /?. This case is symmetrical to the previous one. 

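- $\Psi = !\Psi_1$. From $\alpha \prec_\Psi \beta$ we obtain $\beta \prec_{\Psi_1} \alpha$. Since $\beta \prec_\Psi \gamma$ implies $\gamma \prec_{\Psi_1} \beta$, from 
the induction hypothesis we can conclude $\gamma \prec_{\Psi_1} \alpha$ and ultimately $\alpha \prec_\Psi \gamma$. 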

- $\Psi = \Psi_1 \lhd \cdots \lhd \Psi_k$. From the definition of $\alpha \prec_\Psi \beta$ we have that there exists 
$1 \le i \le k$ such that 

  - $\alpha \approx_{\Psi_j} \beta$ for all $1 \le j < i$ and 

  - $\alpha \prec_{\Psi_i} \beta$. 

Furthermore, from P 7 we know that (3 7 for all 1 < j < ^- Since all 
Ki^^ are equivalence relations, then we have that a 7 for all 1 < i < «. 
Furthermore, from the induction hypothesis, from a -<ip. /? and /? 7 we can 
conclude a -<3r; 7. This finally leads to a 7. 

□ 

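We now show that $\preceq_\Psi$ is a partial order on the set of representatives of the 
equivalence classes of $\approx_\Psi$. 

Proposition 4 

Let $\Psi$ be a general preference. Then $\preceq_\Psi$ is a partial order on the set of representatives 
of the equivalence classes of $\approx_\Psi$. 

Proof 

We need to show that $\preceq_\Psi$ is reflexive, antisymmetric, and transitive. Reflexivity 
follows from the fact that $\approx_\Psi$ is an equivalence relation and thus $\alpha \preceq_\Psi \alpha$ holds for 
every $\alpha$. We prove that $\preceq_\Psi$ is antisymmetric and transitive using structural induction 
on $\Psi$. 

• Base: This corresponds to $\Psi$ being an atomic preference. The two properties follow 
from Proposition 3. 

• Inductive step: Let us consider the possible cases for $\Psi$. 

1. $\Psi = \Psi_1 \& \Psi_2$. 

(a) Anti-symmetry: consider two representatives $\alpha, \beta$ and let us assume that 
$\alpha \preceq_\Psi \beta$ and $\beta \preceq_\Psi \alpha$. Again, it is enough if we can show that $\alpha \approx_\Psi \beta$. 
$\alpha \preceq_\Psi \beta$ implies that $\alpha \preceq_{\Psi_1} \beta$ and $\alpha \preceq_{\Psi_2} \beta$ (Lemma 2). $\beta \preceq_\Psi \alpha$ implies that 
$\beta \preceq_{\Psi_1} \alpha$ and $\beta \preceq_{\Psi_2} \alpha$ (Lemma 2). By the induction hypothesis, we have 
that $\alpha \approx_{\Psi_1} \beta$ and $\alpha \approx_{\Psi_2} \beta$, which imply that $\alpha \approx_\Psi \beta$. 

(b) Transitivity: consider three representatives $\alpha_1$, $\alpha_2$, and $\alpha_3$ with $\alpha_1 \preceq_\Psi \alpha_2$ 
and $\alpha_2 \preceq_\Psi \alpha_3$. The first relationship implies that $\alpha_1 \preceq_{\Psi_1} \alpha_2$ and $\alpha_1 \preceq_{\Psi_2} \alpha_2$. 
The second relationship implies that $\alpha_2 \preceq_{\Psi_1} \alpha_3$ and $\alpha_2 \preceq_{\Psi_2} \alpha_3$. From the 
induction hypothesis we have $\alpha_1 \preceq_{\Psi_1} \alpha_3$ and $\alpha_1 \preceq_{\Psi_2} \alpha_3$. Thus, $\alpha_1 \preceq_\Psi \alpha_3$. 

2. $\Psi = \Psi_1 \mid \Psi_2$. Arguments similar to the above case (with the help of Lemma 
2) allow us to conclude that anti-symmetry and transitivity also hold for this 
case. 

3. $\Psi = !\Psi_1$. 

(a) Anti-symmetry: consider two representatives $\alpha, \beta$ and let us assume that 
$\alpha \preceq_\Psi \beta$ and $\beta \preceq_\Psi \alpha$. Lemma 2 implies that $\beta \preceq_{\Psi_1} \alpha$ and $\alpha \preceq_{\Psi_1} \beta$. By the 
induction hypothesis, we have that $\alpha \approx_{\Psi_1} \beta$. This allows us to conclude that 
$\alpha \approx_\Psi \beta$. 

(b) Transitivity: consider three representatives $\alpha_1$, $\alpha_2$, and $\alpha_3$ with $\alpha_1 \preceq_\Psi \alpha_2$ 
and $\alpha_2 \preceq_\Psi \alpha_3$. Again, Lemma 2 implies that $\alpha_2 \preceq_{\Psi_1} \alpha_1$ and $\alpha_3 \preceq_{\Psi_1} \alpha_2$. 
The induction hypothesis implies that $\alpha_3 \preceq_{\Psi_1} \alpha_1$, and hence, $\alpha_1 \preceq_\Psi \alpha_3$. 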

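4. $\Psi = \Psi_1 \lhd \cdots \lhd \Psi_k$. 

(a) Anti-symmetry: consider two representatives $\alpha, \beta$ with $\alpha \preceq_\Psi \beta$ and $\beta \preceq_\Psi \alpha$. 
Assume that $\alpha \prec_\Psi \beta$. This means that there exists $1 \le i \le k$ such that 
$\alpha \approx_{\Psi_j} \beta$ for all $1 \le j < i$ and $\alpha \prec_{\Psi_i} \beta$. This implies that $\beta \preceq_\Psi \alpha$ cannot 
hold, i.e., we have a contradiction. This means that $\alpha \approx_\Psi \beta$. 

(b) Transitivity: consider three representatives $\alpha_1$, $\alpha_2$, and $\alpha_3$ with $\alpha_1 \preceq_\Psi \alpha_2$ 
and $\alpha_2 \preceq_\Psi \alpha_3$. We have four sub-cases: 

i. $\alpha_1 \approx_\Psi \alpha_2$ and $\alpha_2 \approx_\Psi \alpha_3$. In this case, we have that $\alpha_1 \approx_\Psi \alpha_3$ because 
$\approx_\Psi$ is an equivalence relation. 

ii. $\alpha_1 \prec_\Psi \alpha_2$ and $\alpha_2 \approx_\Psi \alpha_3$. Lemma 3 implies that $\alpha_1 \prec_\Psi \alpha_3$. 

iii. $\alpha_1 \approx_\Psi \alpha_2$ and $\alpha_2 \prec_\Psi \alpha_3$. Lemma 3 implies that $\alpha_1 \prec_\Psi \alpha_3$. 

iv. $\alpha_1 \prec_\Psi \alpha_2$ and $\alpha_2 \prec_\Psi \alpha_3$. This implies that there exist $1 \le i_1, i_2 \le k$ 
such that $\alpha_1 \approx_{\Psi_j} \alpha_2$ for all $1 \le j < i_1$ and $\alpha_1 \prec_{\Psi_{i_1}} \alpha_2$; and $\alpha_2 \approx_{\Psi_j} \alpha_3$ 
for all $1 \le j < i_2$ and $\alpha_2 \prec_{\Psi_{i_2}} \alpha_3$. 

If $i_1 < i_2$, then from the fact that the $\approx_{\Psi_j}$ are equivalence relations, we can 
conclude that $\alpha_1 \approx_{\Psi_j} \alpha_3$ for all $1 \le j < i_1$. Furthermore, by Lemma 3, 
from $\alpha_1 \prec_{\Psi_{i_1}} \alpha_2$ and $\alpha_2 \approx_{\Psi_{i_1}} \alpha_3$, we can conclude that $\alpha_1 \prec_{\Psi_{i_1}} \alpha_3$. This 
leads to $\alpha_1 \prec_\Psi \alpha_3$. 

Similarly, if $i_1 > i_2$, we have that $\alpha_1 \approx_{\Psi_j} \alpha_3$ for all $1 \le j < i_2$ and 
$\alpha_1 \prec_{\Psi_{i_2}} \alpha_3$. This leads to $\alpha_1 \prec_\Psi \alpha_3$. 

Finally, if $i_1 = i_2$, then we have that $\alpha_1 \approx_{\Psi_j} \alpha_3$ for all $1 \le j < i_1$. 
Furthermore, since $\alpha_1 \prec_{\Psi_{i_1}} \alpha_2$ and $\alpha_2 \prec_{\Psi_{i_1}} \alpha_3$, from the induction hypothesis 
we obtain $\alpha_1 \prec_{\Psi_{i_1}} \alpha_3$. This also leads to $\alpha_1 \prec_\Psi \alpha_3$. 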

□ 

Definition 12 

Given a general preference $\Psi$, we say that a trajectory $\alpha$ is most preferred if there is 
no trajectory that is preferred to $\alpha$. 

The next example highlights some differences and similarities between basic desires 
and general preferences. 

Example 7 

Let us consider the original action theory presented in Section 2 with the action 
buy_coffee and a user having the goal of being at the school and having coffee. 
Intuitively, every trajectory achieving the goal of the user would require the action of 
going to the coffee shop, buying the coffee, and going to the school thereafter. 

Let us consider the following two preferences (similar to those discussed in Example 4): 

$time = \mathbf{always}(occ(buy\_coffee) \lor (take\_taxi <^e (drive \lor bus) <^e walk))$ 

and 

$cost = \mathbf{always}(occ(buy\_coffee) \lor (walk <^e (drive \lor bus) <^e take\_taxi))$. 

It is easy to see that most preferred trajectories with respect to time should contain 
only the actions buy_coffee and take_taxi while most preferred trajectories with respect 
to cost should contain only the actions buy_coffee and walk. 
Consider the two preferences: 

$\Psi_1 = time \land cost$ 

and 

$\Psi_2 = time \,\&\, cost$. 

Observe that $\Psi_1$ is a basic desire while $\Psi_2$ is a general preference. It is easy to see that 
there are no trajectories satisfying the preference $\Psi_1$. Thus, every trajectory achieving 
the goal is a most preferred trajectory w.r.t. $\Psi_1$. At the same time, we can show that 
for every pair of trajectories $\alpha$ and $\beta$, neither $\alpha \prec_{\Psi_2} \beta$ nor $\beta \prec_{\Psi_2} \alpha$ holds. 
Let us now consider the two preferences: 

$\Psi_3 = time \lor cost$ 

and 

$\Psi_4 = time \mid cost$ 

with respect to the same set of trajectories. Here, $\Psi_3$ is a basic desire while $\Psi_4$ 
is a general preference. We can see that any trajectory containing the actions taxi 
and walk would be most preferred with respect to $\Psi_3$. All of these trajectories are 
indistinguishable. For example, the trajectory 

$\alpha = s_0\ walk(home, coffee\_shop)\ s_1\ buy\_coffee\ s_2\ walk(coffee\_shop, school)\ s_3$ 

and the trajectory 

$\beta = s_0\ walk(home, coffee\_shop)\ s'_1\ buy\_coffee\ s'_2\ take\_taxi(coffee\_shop, school)\ s'_3$ 

are indistinguishable with respect to $\Psi_3$. On the other hand, we have that $\alpha \prec_{cost} \beta$ 
(the minimal cost action is always used) and $\alpha \approx_{time} \beta$ (the fastest action is not used 
every time). This implies that $\alpha \prec_{\Psi_4} \beta$. 
Let us consider now the preference 

$\Psi_5 = !\,time$. 

It is easy to see that $\Psi_5$ is equivalent to cost in the sense that every most preferred 
trajectory w.r.t. $\Psi_5$ is a most preferred trajectory w.r.t. cost and vice versa. □ 

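The next proposition is of interest for the computation of preferred trajectories. 

Proposition 5 

Let $\Psi_1$, $\Psi_2$ and $\Psi_3$ be three general preferences, $\Psi = \Psi_1 \lhd \Psi_2 \lhd \Psi_3$, and $\Gamma = 
\Psi_1 \lhd (\Psi_2 \lhd \Psi_3)$. For arbitrary trajectories $\alpha$ and $\beta$, the following holds: 

• $\alpha \approx_\Psi \beta$ if and only if $\alpha \approx_\Gamma \beta$; and 

• $\alpha \prec_\Psi \beta$ if and only if $\alpha \prec_\Gamma \beta$. 

Proof 

• We have that $\alpha \approx_\Psi \beta$ iff $\alpha \approx_{\Psi_i} \beta$ for $i \in \{1,2,3\}$ iff $\alpha \approx_{\Psi_1} \beta$ and $\alpha \approx_{\Psi_2 \lhd \Psi_3} \beta$ iff 
$\alpha \approx_\Gamma \beta$. 

• Since $\alpha \prec_\Psi \beta$ iff there exists $i \in \{1,2,3\}$ such that $\alpha \approx_{\Psi_j} \beta$ for $1 \le j < i$ and 
$\alpha \prec_{\Psi_i} \beta$, we have three cases: $i = 1$, $2$, or $3$. We consider these three cases: 

- $i = 1$. This implies immediately that $\alpha \prec_\Gamma \beta$. 

- $i = 2$. This means that $\alpha \approx_{\Psi_1} \beta$ and $\alpha \prec_{\Psi_2} \beta$. This in turn implies that $\alpha \approx_{\Psi_1} \beta$ 
and $\alpha \prec_{\Psi_2 \lhd \Psi_3} \beta$, i.e., $\alpha \prec_\Gamma \beta$. 

- $i = 3$. This is similar to the case $i = 2$. 

Thus, we have that if $\alpha \prec_\Psi \beta$ then $\alpha \prec_\Gamma \beta$. The proof of the reverse is similar. 
□ 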

We would like to thank the anonymous reviewer who pointed out that this proposition is necessary 
for the computation presented in the next section. 

4 Computing Preferred Trajectories 

In this section, we address the problem of computing preferred trajectories. The ability 
to use the operators $\land$, $\lor$, $\neg$ as well as &, |, ! in the construction of preference formulae 
allows us to combine several preferences into a preference formula. For example, if a 
user has two atomic preferences $\Psi$ and $\Phi$ but does not prefer $\Psi$ over $\Phi$ or vice versa, 
he can combine them into a single preference $\Psi \land \Phi \lhd \Psi \lor \Phi \lhd \neg\Psi \land \neg\Phi$. The same 
can be done if $\Psi$ or $\Phi$ are general preferences. Thus, without loss of generality, we can 
assume that we only have one preference formula to deal with. We would like to note 
that this way of combining preferences creates a preference whose size could be 
exponential in the number of preferences. We believe, however, that it is more likely 
that a user, when presented with a set of preferences, will have a preferred order on 
some of these preferences. This information can be used to create a single preference 
with a reasonable and manageable size. 

Given a planning problem $\langle D, I, G \rangle$ and a preference formula $\varphi$, we are interested 
in finding a most preferred trajectory $\alpha$ achieving $G$ w.r.t. the preference $\varphi$. We will 
show how this can be done in answer set programming. 

A naive encoding could be realized by modeling $\langle D, I, G \rangle$ in logic programming 
(following the scheme described in (Son et al. 2005)), using an answer set engine to 
determine its answer sets (and, thus, the trajectories), and then filtering them 
according to $\varphi$. 
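
The filtering step of this naive scheme can be sketched as follows. The sketch assumes that the candidate trajectories have already been computed (here they are a hand-written list standing in for the answer sets) and that the preference is available as a Boolean function `preferred(b, a)`; both the list and the function `prefers_no_taxi` are hypothetical.

```python
def most_preferred(trajectories, preferred):
    """Keep every trajectory a such that no candidate b is preferred to a."""
    return [a for a in trajectories
            if not any(preferred(b, a) for b in trajectories)]


def prefers_no_taxi(a, b):
    # Hypothetical basic desire "never take a taxi": a beats b if only a satisfies it.
    return ("taxi" not in a) and ("taxi" in b)


plans = [["walk"], ["call_taxi", "taxi"], ["bus"]]
best = most_preferred(plans, prefers_no_taxi)
```

The filter compares every pair of candidates, so its cost grows quadratically with the number of trajectories, on top of the cost of enumerating all of them first.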

In the rest of this section, instead, we present a more sophisticated approach which 
allows us to determine a most preferred trajectory. We achieve this by encoding each 
basic desire $\varphi$ as a set of rules $\Pi_\varphi$ and by developing two sets of rules $\Pi_{sat}$ and $\Pi_{pref}$. 
The program $\Pi_{sat}$ checks whether a basic desire is satisfied by a trajectory. On the 
other hand, $\Pi_{pref}$ consists of rules that, when used together with the maximize 
construct of smodels, allow us to find a most preferred trajectory with respect to a 
preference formula. Since $\Pi(D, I, G)$ has already been discussed in Section 2, we will 
start by defining $\Pi_\varphi$. 

4.1 $\Pi_\varphi$: Encoding of Basic Desire Formulae 

The encoding of a basic desire formula is similar to the encoding of a fluent formula 
proposed in (Son et al. 2005). In our encoding, we will use the predicate desire as a 
domain predicate. The elements of the set $\{desire(l) \mid l$ is a fluent literal$\}$ belong to 
$\Pi_\varphi$. Each atom in this set declares the fact that each literal is also a desire. Next, 
we associate to each basic desire formula $\varphi$ a unique name $n_\varphi$. If $\varphi$ is a basic desire 
formula then it will be encoded as a set of facts, denoted by $\Pi_\varphi$. This set is defined 
inductively over the structure of $\varphi$.* The encoding is performed as follows. 

- If $\varphi$ is a fluent literal $l$ then $\Pi_\varphi = \{desire(l)\}$; 

- If $\varphi = \mathbf{goal}(\phi)$ then $\Pi_\varphi = \Pi_\phi \cup \{desire(n_\varphi), goal(n_\phi)\}$; 

- If $\varphi = occ(a)$ then $\Pi_\varphi = \{desire(n_\varphi), happen(n_\varphi, a)\}$; 

- If $\varphi = \varphi_1 \land \varphi_2$ then $\Pi_\varphi = \Pi_{\varphi_1} \cup \Pi_{\varphi_2} \cup \{desire(n_\varphi), and(n_\varphi, n_{\varphi_1}, n_{\varphi_2})\}$; 

- If $\varphi = \varphi_1 \lor \varphi_2$ then $\Pi_\varphi = \Pi_{\varphi_1} \cup \Pi_{\varphi_2} \cup \{desire(n_\varphi), or(n_\varphi, n_{\varphi_1}, n_{\varphi_2})\}$; 

- If $\varphi = \neg\phi$ then $\Pi_\varphi = \Pi_\phi \cup \{desire(n_\varphi), negation(n_\varphi, n_\phi)\}$; 

- If $\varphi = \mathbf{next}(\phi)$ then $\Pi_\varphi = \Pi_\phi \cup \{desire(n_\varphi), next(n_\varphi, n_\phi)\}$; 

- If $\varphi = \mathbf{until}(\varphi_1, \varphi_2)$ then $\Pi_\varphi = \Pi_{\varphi_1} \cup \Pi_{\varphi_2} \cup \{desire(n_\varphi), until(n_\varphi, n_{\varphi_1}, n_{\varphi_2})\}$; 

- If $\varphi = \mathbf{always}(\phi)$ then $\Pi_\varphi = \Pi_\phi \cup \{desire(n_\varphi), always(n_\varphi, n_\phi)\}$; 

- If $\varphi = \mathbf{eventually}(\phi)$ then $\Pi_\varphi = \Pi_\phi \cup \{desire(n_\varphi), eventually(n_\varphi, n_\phi)\}$. 

* To simplify this encoding, we have developed a Prolog program 
that translates $\varphi$ into $\Pi_\varphi$. This program can be downloaded from 
http://www.cs.nmsu.edu/~tson/ASPlan/Preferences/conv_form.pl 

It is worth noting that, due to the uniqueness of names for basic desires, will not 
occur in 11$ \ {desire{n^)} . 
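
The inductive translation of a basic desire into facts can be illustrated with a short sketch. It is not the Prolog translator mentioned in the footnote; it assumes a hypothetical tuple representation of formulae, generates the names `n1`, `n2`, ... with a counter (the text only requires the names to be unique), and omits the goal case for brevity.

```python
from itertools import count

UNARY = {"negation", "next", "always", "eventually"}
BINARY = {"and", "or", "until"}


def encode(phi, fresh=None, facts=None):
    """Return (name, facts) for the basic desire phi, given as a nested tuple."""
    if fresh is None:
        fresh = count(1)                      # source of fresh names n1, n2, ...
    if facts is None:
        facts = set()
    op = phi[0]
    if op == "lit":                           # fluent literal l: {desire(l)}
        facts.add(f"desire({phi[1]})")
        return phi[1], facts
    if op == "occ":                           # occ(a): {desire(n), happen(n, a)}
        name = f"n{next(fresh)}"
        facts.update({f"desire({name})", f"happen({name},{phi[1]})"})
        return name, facts
    if op not in UNARY | BINARY:
        raise ValueError(f"unsupported operator: {op}")
    # Encode the sub-formulae first, then add the facts that describe phi itself.
    subnames = [encode(sub, fresh, facts)[0] for sub in phi[1:]]
    name = f"n{next(fresh)}"
    facts.add(f"desire({name})")
    facts.add(f"{op}({name},{','.join(subnames)})")
    return name, facts


phi = ("always", ("or", ("occ", "buy_coffee"), ("lit", "at_school")))
top, pi_phi = encode(phi)
```

Each sub-formula receives its own name, so the resulting fact set mirrors the syntax tree of the formula.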

4.2 $\Pi_{sat}$: Rules for Checking of Basic Desire Formula Satisfaction 

We now present the set of rules that checks whether a trajectory satisfies a basic desire 
formula. Recall that an answer set of the program $\Pi(D, I, G)$ will contain a trajectory 
where action occurrences are recorded by atoms of the form $occ(a, t)$ and the truth 
value of fluent literals is represented by atoms of the form $holds(f, t)$, where $a \in A$, 
$f$ is a fluent literal, and $t$ is a time moment between 0 and length. The program $\Pi_{sat}$ 
defines the predicate $satisfy(F, T)$, where $F$ and $T$ are variables representing a basic 
desire and a time moment, respectively. The satisfiability of a fluent formula at a 
time moment will be defined by the predicate $h(F, T)$, which builds on the previously 
mentioned predicate holds and the usual logical operators, such as $\land$, $\lor$, $\neg$. Intuitively, 
$satisfy(F, T)$ says that the basic desire $F$ is satisfied by the trajectory starting from 
the time moment $T$. The rules of $\Pi_{sat}$ are defined inductively on the structure of $F$ 
and are given next. 

satisfy(F, T) ← desire(F), goal(F), satisfy(F, length).                      (1) 

satisfy(F, T) ← desire(F), happen(F, A), occ(A, T).                          (2) 

satisfy(F, T) ← desire(F), literal(F), holds(F, T).                          (3) 

satisfy(F, T) ← desire(F), and(F, F1, F2), 
                satisfy(F1, T), satisfy(F2, T).                              (4) 

satisfy(F, T) ← desire(F), or(F, F1, F2), satisfy(F1, T).                    (5) 

satisfy(F, T) ← desire(F), or(F, F1, F2), satisfy(F2, T).                    (6) 

satisfy(F, T) ← desire(F), negation(F, F1), not satisfy(F1, T).              (7) 

satisfy(F, T) ← desire(F), until(F, F1, F2), T < T1, 
                during(F1, T, T1 - 1), satisfy(F2, T1).                      (8) 

satisfy(F, T) ← desire(F), until(F, F1, F2), satisfy(F2, T).                 (9) 

satisfy(F, T) ← desire(F), always(F, F1), during(F1, T, length).             (10) 

satisfy(F, T) ← desire(F), next(F, F1), satisfy(F1, T + 1).                  (11) 

satisfy(F, T) ← desire(F), eventually(F, F1), T ≤ T1, 
                satisfy(F1, T1).                                             (12) 

during(F, T, T1) ← T < T1, desire(F), satisfy(F, T), 
                   during(F, T + 1, T1).                                     (13) 

during(F, T, T) ← desire(F), satisfy(F, T).                                  (14) 
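
To see what rules (1)-(14) compute on a concrete trajectory, the following sketch evaluates a basic desire directly on a finite trajectory. It is a procedural counterpart written for illustration, not the answer set program itself, and it rests on assumptions: formulae are nested tuples, a state is a set of fluent names, `actions[t]` is the action executed at time `t`, and `len(actions)` plays the role of `length`.

```python
def satisfy(phi, t, states, actions):
    """True if the basic desire phi holds on the trajectory from time t onwards."""
    n = len(actions)                          # plays the role of `length`
    op = phi[0]

    def sat(psi, u):
        return satisfy(psi, u, states, actions)

    if op == "lit":                           # rule (3)
        return phi[1] in states[t]
    if op == "occ":                           # rule (2): no action occurs at time n
        return t < n and actions[t] == phi[1]
    if op == "goal":                          # rule (1): evaluate in the final state
        return sat(phi[1], n)
    if op == "and":                           # rule (4)
        return sat(phi[1], t) and sat(phi[2], t)
    if op == "or":                            # rules (5)-(6)
        return sat(phi[1], t) or sat(phi[2], t)
    if op == "negation":                      # rule (7)
        return not sat(phi[1], t)
    if op == "next":                          # rule (11)
        return t < n and sat(phi[1], t + 1)
    if op == "always":                        # rules (10), (13), (14)
        return all(sat(phi[1], u) for u in range(t, n + 1))
    if op == "eventually":                    # rule (12)
        return any(sat(phi[1], u) for u in range(t, n + 1))
    if op == "until":                         # rules (8)-(9)
        return any(sat(phi[2], u) and all(sat(phi[1], v) for v in range(t, u))
                   for u in range(t, n + 1))
    raise ValueError(f"unsupported operator: {op}")


# Hypothetical trajectory: walk to the coffee shop, buy coffee, walk to school.
states = [{"at_home"}, {"at_coffee_shop"},
          {"at_coffee_shop", "has_coffee"}, {"at_school", "has_coffee"}]
actions = ["walk", "buy_coffee", "walk"]

reaches_school = satisfy(("goal", ("lit", "at_school")), 0, states, actions)
no_coffee_until_bought = satisfy(
    ("until", ("negation", ("lit", "has_coffee")), ("occ", "buy_coffee")),
    0, states, actions)
```

The `always` and `until` cases unfold the auxiliary predicate `during` into explicit ranges over time moments.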
In the next theorem, we prove the correctness of $\Pi_{sat}$. We need some additional 
notation. For a trajectory $\alpha = s_0 a_1 \ldots a_n s_n$, let 

$\alpha^{-1} = \{occ(a_i, i - 1) \mid i \in \{1, \ldots, n\}\} \cup \{holds(f, i) \mid f \in s_i, i \in \{0, \ldots, n\}\}$. 

We have: 

Theorem 2 

Let $\langle D, I, G \rangle$ be a planning problem, $\alpha = s_0 a_1 \ldots a_n s_n$ be a trajectory, and $\varphi$ be a 
basic desire formula. Then, for every $0 \le t \le length$ and every basic desire formula 
$\eta$ with $desire(n_\eta) \in \Pi_\varphi$, 

$\alpha[t] \models \eta$ iff $\Pi_\varphi \cup \Pi_{sat} \cup (\alpha)^{-1} \models satisfy(n_\eta, t)$. 

(Recall that $\alpha[t]$ is the trajectory $s_t a_{t+1} \ldots a_n s_n$.) 

First, we prove that the program $\Pi = \Pi_\varphi \cup \Pi_{sat} \cup (\alpha)^{-1}$ has only one answer set. It 
is well-known that if a program is locally stratified then it has a unique answer set 
(Apt et al. 1988; Przymusinski 1988). We will show that $\Pi$ (more precisely, the set of 
ground instances of rules in $\Pi$) is indeed locally stratified. To accomplish this we need 
to find a mapping $\lambda$ from literals of the grounding of $\Pi$ to $\mathbb{N}$ that has the property: if 

$A_0 \leftarrow A_1, A_2, \ldots, A_n, not\ B_1, not\ B_2, \ldots, not\ B_m$ 

is a rule in $\Pi$, then $\lambda(A_0) \ge \lambda(A_i)$ for all $1 \le i \le n$ and $\lambda(A_0) > \lambda(B_j)$ for all 
$1 \le j \le m$. 

To define $\lambda$, we first associate a non-negative number $\sigma(n_\varphi)$ to each constant $n_\varphi$ as 
follows. Intuitively, $\sigma(n_\varphi)$ represents the complexity of $\varphi$. 

$\sigma(l) = 0$ if $l$ is a literal. 

$\sigma(n_\eta) = 0$ if $\eta$ has the form $occ(a)$. 

$\sigma(n_\eta) = \sigma(n_{\eta_1}) + 1$ if $\eta$ has the form $\neg\eta_1$, $\mathbf{next}(\eta_1)$, $\mathbf{eventually}(\eta_1)$, or $\mathbf{always}(\eta_1)$. 

$\sigma(n_\eta) = \max\{\sigma(n_{\eta_1}), \sigma(n_{\eta_2})\} + 1$ if $\eta$ has the form $\eta_1 \land \eta_2$, $\eta_1 \lor \eta_2$, or $\mathbf{until}(\eta_1, \eta_2)$. 

$\sigma(n_\eta) = \sigma(n_{\eta_1})$ if $\eta = \mathbf{goal}(\eta_1)$. 

We define $\lambda$ as follows. 

$\lambda(satisfy(n_\eta, t)) = 5 * \sigma(n_\eta) + 2$, 
$\lambda(during(n_\eta, t, t')) = 5 * \sigma(n_\eta) + 4$, and 
$\lambda(l) = 0$ for every other literal of $\Pi$. 
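
The level mapping can be checked on small formulae with a few lines of code. The sketch assumes the same hypothetical tuple representation of formulae used for illustration elsewhere; `level_satisfy` and `level_during` are made-up names for the two non-zero cases of the mapping.

```python
UNARY = {"negation", "next", "eventually", "always"}
BINARY = {"and", "or", "until"}


def sigma(eta):
    """Complexity of a basic desire, following the case analysis above."""
    op = eta[0]
    if op in ("lit", "occ"):
        return 0
    if op == "goal":
        return sigma(eta[1])
    if op in UNARY:
        return sigma(eta[1]) + 1
    if op in BINARY:
        return max(sigma(eta[1]), sigma(eta[2])) + 1
    raise ValueError(f"unsupported operator: {op}")


def level_satisfy(eta):
    """Level assigned to the atoms satisfy(n_eta, t)."""
    return 5 * sigma(eta) + 2


def level_during(eta):
    """Level assigned to the atoms during(n_eta, t, t')."""
    return 5 * sigma(eta) + 4


inner = ("negation", ("lit", "has_coffee"))
outer = ("always", inner)

# Rule (7): the head must sit strictly above the negated body atom.
rule7_ok = level_satisfy(inner) > level_satisfy(("lit", "has_coffee"))
# Rule (10): the head must not sit below the positive body atom during(...).
rule10_ok = level_satisfy(outer) >= level_during(inner)
```

Because the complexity grows by one at each operator, the factor 5 leaves enough room between consecutive levels for both the `satisfy` and the `during` atoms.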

We need to check that $\lambda$ satisfies the necessary requirements. For example, for the 
rule (1), we have that $\lambda(satisfy(n_\eta, t)) = \lambda(satisfy(n_\eta, length)) = 5 * \sigma(n_\eta) + 2$ and 
$\lambda(satisfy(n_\eta, t)) \ge 2 > 0 = \lambda(goal(n_\eta)) = \lambda(desire(n_\eta))$. For the rule (10), we have 
that $\lambda(satisfy(n_\eta, t)) = 5 * \sigma(n_\eta) + 2 = 5 * (\sigma(n_{\eta_1}) + 1) + 2 > 5 * \sigma(n_{\eta_1}) + 2 = 
\lambda(satisfy(n_{\eta_1}, length))$. The verification of this property for other rules is similar. 
Thus, we can conclude that $\Pi$ has only one answer set. Let $X$ be the answer set of $\Pi$. 
We prove the conclusion of the theorem by induction over $\sigma(n_\eta)$. 

Base: Let 77 be a formula with 17(71^) — 0. By the definition of cr, we know that 77 is 
a fluent literal or of the form occ{a). If 7/ is a literal, then 77 is true in Sj iff 77 is in s^, 
that is, iff holds{ri,t) belongs to X, which, because of rule ©, proves the base case. 
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If ?7 = occ{a) then we know that happen{n.^,a) G H^. Thus, satis fy{n,-i,t) £ X iff 
occ{a, t) £ X (because of the rule ((SJ) iff a[t] \= rj. 

Step: Assume that for all < j < /c and formula r] such that a{nri) — j, a[t] ^ 77 iff 
satis fy{nrf, t) is in X. 

Let Tj be such a formula that cr(n^) = fc + 1. Because of the definition of cr, we have 
the following cases: 

Case \: rj — -ir/i. We have that a{nri^ ) = cr(n^) — 1 = fc. Since negation{nrj, nri^) £ X, 
satis fy{njj,t) £ X iff the body of rule |(7J) is satisfied by X iff satis fy{nri^,t) ^ X iff 
a[t] rji (by the induction hypothesis) iff a[t\ [= rj. 

Case 2: ry = 771 A r/2. Similar to the first case, it follows from the rule and the 
facts desire{nri) and and{nri,nr,^,nr,2) in that satis fy{nr,,t) £ X iff the body of 
rule Q is satisfied by X iff satis fy{nri^^t) £ X and satis fy{nri2,t) £ X iff a[t] ^ rji 
and a[t] ^ 772 (induction hypothesis) iff a[t] |= 77. 

Case 3: 7; = 771 V 772- The proof is similar to the above cases, relying on the two rules 
(01, ®, and the facts that desire{nrj) £ and or(7i^, n^^ , n^^) £ 11^. 
Case 4: 77 = until(7;i, 772). We have that o'(n^j) < A; and ^(tt,^^) < fc. Assume that 
Q[t] ^ 77. By Definitional there exists t < t2 < n such that Q:[t2] H ^72 and for 
all t < ti < <2, Q:[^i] H By the induction hypothesis, satis fy{nri2,t2) £ X and 
satis fy{nji^,ti) £ X for t < < i2. It follows that during{nrj-^,t,t2 — i) 6 Because 
of rule 0-©, we have satis fy{n,^,t) £ X. 

On the other hand, if satis fy{nrj, t) £ X, because the only rules supporting satis fy{nrj, t) 
are (jH!-©, there exists t2, t < t2 < length, and during {nrij^,t,t2 — 1) £ X, and 
satis fy{nji2, 12). It follows from during {njj^,t,t2 — I) E X that satis fy(nri^,ti) £ X 
for all t < ii < t2- By the induction hypothesis, we have a[ti\ \^ rji for all t < ti < t2 
and a[t2] \= 772. Thus a[t] ^ until(77i, 772), i.e., a[i] \= 77. 

Case 5: $\eta = \mathbf{next}(\eta_1)$. Note that $\sigma(n_{\eta_1}) \le k$. The rule for next is the only rule supporting $satisfy(n_\eta, t)$ where $\eta = \mathbf{next}(\eta_1)$. So $satisfy(n_\eta, t) \in X$ iff $satisfy(n_{\eta_1}, t + 1) \in X$ iff $\alpha[t + 1] \models \eta_1$ iff $\alpha[t] \models \mathbf{next}(\eta_1)$.

Case 6: $\eta = \mathbf{always}(\eta_1)$. We note that $\sigma(n_{\eta_1}) \le k$. Observe that $satisfy(n_\eta, t)$ is supported only by the rule for always. So $satisfy(n_\eta, t) \in X$ iff $during(n_{\eta_1}, t, n) \in X$. The latter happens iff $satisfy(n_{\eta_1}, t_1) \in X$ for all $t \le t_1 \le n$, that is, iff $\alpha[t_1] \models \eta_1$ for all $t \le t_1 \le n$, which is equivalent to $\alpha[t] \models \mathbf{always}(\eta_1)$, i.e., iff $\alpha[t] \models \eta$.

Case 7: $\eta = \mathbf{eventually}(\eta_1)$. We know that $satisfy(n_\eta, t) \in X$ is supported only by the rule for eventually. So $satisfy(n_\eta, t) \in X$ iff there exists $t \le t_1 \le n$ such that $satisfy(n_{\eta_1}, t_1) \in X$. Because $\sigma(n_{\eta_1}) \le k$, by induction, $satisfy(n_\eta, t) \in X$ iff there exists $t \le t_1 \le n$ such that $\alpha[t_1] \models \eta_1$, that is, iff $\alpha[t] \models \mathbf{eventually}(\eta_1)$, i.e., iff $\alpha[t] \models \eta$.

Case 8: $\eta = \mathbf{goal}(\eta_1)$. Since $\eta_1$ does not contain the goal operator, it follows from the above cases that $satisfy(n_{\eta_1}, n) \in X$ iff $\alpha[n] \models \eta_1$. From the rule for goal, we can conclude that $satisfy(n_\eta, t) \in X$ iff $satisfy(n_{\eta_1}, n) \in X$ iff $\alpha[t] \models \eta$.

The above cases prove the inductive step and, hence, the theorem. □ 
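
The case analysis above follows the semantics of basic desires operator by operator. As an illustration only, the following Python sketch evaluates $\alpha[t] \models \eta$ directly on a trajectory. The tuple encoding of formulas and the names `Trajectory` and `holds` are assumptions made for this example and are not part of the answer set encoding; in particular, the sketch assumes that next is false at the last state and that an action occurs at step $t$ only for $t < n$.

```python
from typing import List, Set, Tuple


class Trajectory:
    """s0 a1 s1 ... an sn: states[i] is the set of fluents true in s_i,
    actions[i] is the action executed in s_i (hypothetical representation)."""

    def __init__(self, states: List[Set[str]], actions: List[str]):
        assert len(actions) == len(states) - 1
        self.states, self.actions = states, actions
        self.n = len(actions)


def holds(alpha: Trajectory, t: int, eta: Tuple) -> bool:
    """Return True iff alpha[t] |= eta, for 0 <= t <= alpha.n."""
    op = eta[0]
    if op == "fluent":
        return eta[1] in alpha.states[t]
    if op == "occ":
        return t < alpha.n and alpha.actions[t] == eta[1]
    if op == "not":
        return not holds(alpha, t, eta[1])
    if op == "and":
        return holds(alpha, t, eta[1]) and holds(alpha, t, eta[2])
    if op == "or":
        return holds(alpha, t, eta[1]) or holds(alpha, t, eta[2])
    if op == "next":
        return t < alpha.n and holds(alpha, t + 1, eta[1])
    if op == "always":
        return all(holds(alpha, t1, eta[1]) for t1 in range(t, alpha.n + 1))
    if op == "eventually":
        return any(holds(alpha, t1, eta[1]) for t1 in range(t, alpha.n + 1))
    if op == "until":
        # some t2 >= t satisfies eta2, and eta1 holds at every t <= t1 < t2
        return any(
            holds(alpha, t2, eta[2])
            and all(holds(alpha, t1, eta[1]) for t1 in range(t, t2))
            for t2 in range(t, alpha.n + 1)
        )
    if op == "goal":
        return holds(alpha, alpha.n, eta[1])
    raise ValueError(f"unknown operator: {op}")
```

Each branch corresponds to one case of the proof, so the function can serve as a reference when checking the output of the logic program on small trajectories.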

The next theorem follows from Theorems 1 and 2.

Theorem 3

Let $\langle D, I, G\rangle$ be a planning problem and $\varphi$ be a basic desire formula. For every answer set $M$ of the program $\Pi(D, I, G, \varphi) = \Pi(D, I, G) \cup \Pi_\varphi \cup \Pi_{sat}$,

$$\alpha_M \models \varphi \ \text{ iff } \ satisfy(n_\varphi, 0) \in M,$$

where $\alpha_M$ denotes the trajectory achieving $G$ represented by $M$.



Proof 



Let $S$ be the set of literals of the program $\Pi(D, I, G)$. It is easy to see that for every rule $r$ in $\Pi(D, I, G, \varphi)$, if the head of $r$ belongs to $S$ then every literal occurring in the body of $r$ also belongs to $S$. Thus, $S$ is a splitting set of $\Pi(D, I, G, \varphi)$. Using the Splitting Theorem (Lifschitz and Turner 1994), we can show that $M$ is an answer set of $\Pi(D, I, G, \varphi)$ iff $M = X \cup Y$, where $X$ is an answer set of $\Pi(D, I, G)$ and $Y$ is an answer set of the program $\Pi_\varphi \cup \Pi_{sat} \cup (\alpha_M)^{-1}$. It follows from Theorem 2 that $\alpha_M[0] \models \varphi$ iff $satisfy(n_\varphi, 0) \in Y$ iff $satisfy(n_\varphi, 0) \in M$. □

The above theorem allows us to compute a most preferred trajectory using the smodels system. Let $\Pi(D, I, G, \varphi)$ be the program consisting of $\Pi(D, I, G) \cup \Pi_\varphi \cup \Pi_{sat}$ and the rule

    maximize{satisfy(n_φ, 0) = 1, not satisfy(n_φ, 0) = 0}.    (15)

Note that rule (15) represents the fact that the answer sets in which $satisfy(n_\varphi, 0)$ holds are most preferred. smodels will first try to compute answer sets of $\Pi$ in which $satisfy(n_\varphi, 0)$ holds; only if no answer set with this property exists will other answer sets be considered. (Footnote: The current implementation of smodels has some restrictions on using the maximize construct; our jsmodels system can now deal with this construct properly.)

Theorem 4

Let $\langle D, I, G\rangle$ be a planning problem and $\varphi$ be a basic desire formula. For every answer set $M$ of $\Pi(D, I, G, \varphi)$, if $satisfy(n_\varphi, 0) \in M$ then $\alpha_M$ is a most preferred trajectory w.r.t. $\varphi$.

Proof

The result follows directly from Theorem 3: $satisfy(n_\varphi, 0) \in M$ implies that $\alpha_M \models \varphi$; hence $\alpha_M$ is a most preferred trajectory w.r.t. $\varphi$. □

The above theorem gives us a way to compute a most preferred trajectory with respect to a basic desire. We will now generalize this approach to deal with atomic and general preferences. The intuition is to associate a weight to the different components of the preference formula; these weights are then used to obtain a weight for each trajectory, based on which components of the preference formula are satisfied by the trajectory. The maximize construct in smodels can then be used to compute answer sets with maximal weight, thus computing most preferred trajectories. The weight functions are defined as follows.

Definition 13

Given a general preference $\Psi$, a weight function w.r.t. $\Psi$ (or weight function, for short, when it is clear from the context what $\Psi$ is) assigns to each trajectory $\alpha$ a non-negative number $w_\Psi(\alpha)$.

Since our goal is to use weight functions in generating most preferred trajectories using the maximize construct in smodels, we would like to find weight functions which assign the maximal weight to most preferred trajectories. In what follows, we discuss a class of weight functions that satisfy this property.

Definition 14

Let $\Psi$ be a general preference. A weight function $w_\Psi$ is called admissible if it satisfies the following properties:

(i) if $\alpha \prec_\Psi \beta$ then $w_\Psi(\alpha) > w_\Psi(\beta)$; and

(ii) if $\alpha \approx_\Psi \beta$ then $w_\Psi(\alpha) = w_\Psi(\beta)$.

It is easy to see that if $w_\Psi$ is admissible then the following proposition holds.

Proposition 6

Let $\Psi$ be a general preference formula and $w_\Psi$ be an admissible weight function. If $\alpha$ is a trajectory such that $w_\Psi(\alpha)$ is maximal, then $\alpha$ is a most preferred trajectory w.r.t. $\Psi$.

Proof

Since $w_\Psi(\alpha)$ is maximal and $w_\Psi$ is admissible, we conclude that there exists no trajectory $\beta$ such that $\beta \prec_\Psi \alpha$. Thus, $\alpha$ is a most preferred trajectory w.r.t. $\Psi$. □

The above proposition implies that we can compute a most preferred trajectory using smodels if we can implement an admissible weight function. This is the topic of the next section. We would like to emphasize that the above result states soundness of this method, but not completeness. This means that the computation scheme proposed in the next section will return a most preferred trajectory, if the planning problem admits solutions, but it does not enumerate all of them. This is consistent and in line with the practice used in many well-known planning systems, in which only one solution is returned.



4.3 Computing An Admissible Weight Function

Let $\Psi$ be a general preference. We will now show how an admissible weight function $w_\Psi$ can be built in a bottom-up fashion. We begin with the basic desires.

Definition 15 (Basic Desire Weight)

Let $\varphi$ be a basic desire formula and let $\alpha$ be a trajectory. The weight of the trajectory $\alpha$ w.r.t. the basic desire $\varphi$ is a function defined as

$$w_\varphi(\alpha) = \begin{cases} 1 & \text{if } \alpha \models \varphi \\ 0 & \text{otherwise} \end{cases}$$

The following proposition derives directly from the definition of admissibility.

Proposition 7

Let $\varphi$ be a basic desire. Then $w_\varphi$ is an admissible weight function.

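The weight function of an atomic preference builds on the weight function of the basic desires occurring in the preference as follows.

Definition 16 (Atomic Preference Weight)

Let $\psi = \varphi_1 \lhd \varphi_2 \lhd \cdots \lhd \varphi_k$ be an atomic preference formula. The weight of a trajectory $\alpha$ w.r.t. $\psi$ is defined as follows:

$$w_\psi(\alpha) = \sum_{r=1}^{k} \left(2^{k-r} \times w_{\varphi_r}(\alpha)\right)$$

Proposition 8

Let $\psi = \varphi_1 \lhd \varphi_2 \lhd \cdots \lhd \varphi_k$ be an atomic preference formula, and let $\alpha_1, \alpha_2$ be two trajectories. Then

$$\alpha_1 \prec_\psi \alpha_2 \ \text{ iff } \ w_\psi(\alpha_1) > w_\psi(\alpha_2).$$

Furthermore, we also have that

$$\alpha_1 \approx_\psi \alpha_2 \ \text{ iff } \ w_\psi(\alpha_1) = w_\psi(\alpha_2).$$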

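Proof

Let us start by assuming $\alpha_1 \prec_\psi \alpha_2$. According to the definition, this means that there exists $i$, $1 \le i \le k$, such that

• $\alpha_1 \approx_{\varphi_j} \alpha_2$ for all $1 \le j < i$

• $\alpha_1 \prec_{\varphi_i} \alpha_2$

From Proposition 7 we have that $\alpha_1 \approx_{\varphi_j} \alpha_2$ implies $w_{\varphi_j}(\alpha_1) = w_{\varphi_j}(\alpha_2)$ for $1 \le j < i$. This leads to

$$\sum_{r=1}^{i-1} \left(2^{k-r} \times w_{\varphi_r}(\alpha_1)\right) = \sum_{r=1}^{i-1} \left(2^{k-r} \times w_{\varphi_r}(\alpha_2)\right).$$

In addition, since $\alpha_1 \prec_{\varphi_i} \alpha_2$, we also have that $1 = w_{\varphi_i}(\alpha_1) > w_{\varphi_i}(\alpha_2) = 0$. Thus, we have

$$\begin{aligned}
w_\psi(\alpha_1) &= \sum_{r=1}^{i-1} \left(2^{k-r} \times w_{\varphi_r}(\alpha_1)\right) + 2^{k-i} + \sum_{r=i+1}^{k} \left(2^{k-r} \times w_{\varphi_r}(\alpha_1)\right) \\
&> \sum_{r=1}^{i-1} \left(2^{k-r} \times w_{\varphi_r}(\alpha_1)\right) + 2^{k-i} - 1 \\
&\ge \sum_{r=1}^{i-1} \left(2^{k-r} \times w_{\varphi_r}(\alpha_2)\right) + \sum_{r=i+1}^{k} \left(2^{k-r} \times w_{\varphi_r}(\alpha_2)\right) \\
&= \sum_{r=1}^{k} \left(2^{k-r} \times w_{\varphi_r}(\alpha_2)\right) = w_\psi(\alpha_2).
\end{aligned}$$

The second inequality uses $2^{k-i} - 1 = \sum_{r=i+1}^{k} 2^{k-r}$, and the last equality uses $w_{\varphi_i}(\alpha_2) = 0$.

For similar reasons, it is also easy to see that $\alpha_1 \approx_\psi \alpha_2$ implies $w_\psi(\alpha_1) = w_\psi(\alpha_2)$.

Let us now explore the opposite direction. Let us assume that $w_\psi(\alpha_1) > w_\psi(\alpha_2)$. It is easy to see that there must be an integer $i$, $1 \le i \le k$, such that $\alpha_1 \not\approx_{\varphi_i} \alpha_2$. Consider the minimal integer $i$ satisfying this property, i.e., $\alpha_1 \approx_{\varphi_j} \alpha_2$ for every $j$, $1 \le j < i$. Since $w_\psi(\alpha_1) > w_\psi(\alpha_2)$ and $\alpha_1 \not\approx_{\varphi_i} \alpha_2$, we conclude that $\alpha_1 \prec_{\varphi_i} \alpha_2$. This implies that $\alpha_1 \prec_\psi \alpha_2$.

Similar arguments allow us to prove that if $w_\psi(\alpha_1) = w_\psi(\alpha_2)$ then $\alpha_1 \approx_\psi \alpha_2$. □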
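
The equivalence in Proposition 8 can be checked exhaustively for small chains. The following Python sketch is only an illustration: it represents a trajectory by the vector of truth values of $\varphi_1, \ldots, \varphi_k$ on it, and the function names `atomic_weight`, `prefers`, and `check_proposition_8` are assumptions introduced for this example.

```python
from itertools import product
from typing import Sequence


def atomic_weight(sat: Sequence[bool]) -> int:
    """Weight sum_{r=1..k} 2^(k-r) * w_{phi_r}; sat[r-1] tells whether phi_r holds."""
    k = len(sat)
    return sum(2 ** (k - r) for r in range(1, k + 1) if sat[r - 1])


def prefers(sat1: Sequence[bool], sat2: Sequence[bool]) -> bool:
    """alpha1 is strictly preferred to alpha2 w.r.t. phi_1 <| ... <| phi_k:
    they agree on phi_1..phi_{i-1}, and alpha1 satisfies phi_i while alpha2 does not."""
    for s1, s2 in zip(sat1, sat2):
        if s1 != s2:
            return s1 and not s2
    return False  # indistinguishable trajectories


def check_proposition_8(k: int) -> bool:
    """Compare the preference order with the weight order on all 2^k x 2^k pairs."""
    vectors = list(product([False, True], repeat=k))
    return all(
        prefers(u, v) == (atomic_weight(u) > atomic_weight(v))
        for u in vectors
        for v in vectors
    )
```

The powers of two make the weight a binary number whose most significant bit belongs to $\varphi_1$, which is why comparing weights coincides with the lexicographic comparison used in the definition of the atomic preference.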

We are now ready to define an admissible weight function w.r.t. a general preference.

Definition 17 (General Preference Weight)

Let $\Psi$ be a general preference formula. The weight of a trajectory $\alpha$ w.r.t. $\Psi$ (denoted by $w_\Psi(\alpha)$) is defined as follows:

- if $\Psi$ is an atomic preference then the weight is defined as in Definition 16.

- if $\Psi = \Psi_1 \,\&\, \Psi_2$ then $w_\Psi(\alpha) = w_{\Psi_1}(\alpha) + w_{\Psi_2}(\alpha)$

- if $\Psi = \Psi_1 \mid \Psi_2$ then $w_\Psi(\alpha) = w_{\Psi_1}(\alpha) + w_{\Psi_2}(\alpha)$

- if $\Psi = \,!\Psi_1$ then $w_\Psi(\alpha) = max(\Psi_1) - w_{\Psi_1}(\alpha)$, where $max(\Psi_1)$ represents the maximum weight that a trajectory can achieve on the preference formula $\Psi_1$ plus one.

- if $\Psi = \Psi_1 \lhd \Psi_2$ then $w_\Psi(\alpha) = max(\Psi_2) \times w_{\Psi_1}(\alpha) + w_{\Psi_2}(\alpha)$

We prove the admissibility of $w_\Psi$ in the next proposition.
Proposition 9

For a general preference $\Psi$, $w_\Psi$ is an admissible weight function.

The proof of this proposition is based on Propositions 7-8 and the structure of $\Psi$. It is omitted here for brevity. The above propositions allow us to prove the following result.

Proposition 10

Let $\Psi$ be a general preference and $\alpha$ be a trajectory with the maximal weight w.r.t. $w_\Psi$. Then, $\alpha$ is a most preferred trajectory w.r.t. $\Psi$.

Proof

Let $w_\Psi(\alpha)$ be maximal; let us assume by contradiction that there exists $\beta$ such that $\beta \prec_\Psi \alpha$. It follows from the previous proposition that $\beta \prec_\Psi \alpha$ implies that $w_\Psi(\beta) > w_\Psi(\alpha)$, which contradicts the hypothesis that $w_\Psi(\alpha)$ is maximal. □

Propositions 7-9 show that we can compute an admissible weight function $w_\Psi$ bottom-up from the weight of each basic desire occurring in $\Psi$. We are now ready to define the set of rules $\Pi_{pref}(\Psi)$, which consists of the rules encoding $\Psi$ and the rules encoding the computation of $w_\Psi$. Similarly to the encoding of the basic desires, we will assign a new, distinguished name $n_\phi$ to each preference formula $\phi$ occurring in $\Psi$ and encode the preferences in the same way as we did for the basic desires (Section 4.1). We will also add an atom $preference(n_\phi)$ to the set of atoms encoding $\phi$. For brevity, we omit here the details of this step. The program $\Pi_{pref}(\Psi)$ defines two predicates, $w(p, n)$ and $max(p, n)$, where $p$ is a preference name and $n$ is a number: $w(p, n)$ (resp. $max(p, n)$) is true if the weight (resp. maximal weight) of the current trajectory with respect to the preference $p$ is $n$.

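1. For each basic desire $\phi$, $\Pi_{pref}(\phi)$ contains the rules encoding $\phi$ and the following rules:

$$\begin{array}{l}
w(n_\phi, 1) \leftarrow satisfy(n_\phi, 0) \\
w(n_\phi, 0) \leftarrow not\ satisfy(n_\phi, 0) \\
max(n_\phi, 2) \leftarrow
\end{array} \qquad (16)$$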

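(Footnote on chains $\Psi_1 \lhd \Psi_2$: Because of Proposition 5, without loss of generality, we describe the encoding only for chains of length 2.)

2. For each atomic preference $\phi = \varphi_1 \lhd \varphi_2 \lhd \cdots \lhd \varphi_k$, $\Pi_{pref}(\phi)$ consists of $\bigcup_{j=1}^{k} \Pi_{pref}(\varphi_j)$ and the following rules:

$$\begin{array}{l}
w(n_\phi, S) \leftarrow w(n_{\varphi_1}, N_1), \ldots, w(n_{\varphi_k}, N_k),\ S = \sum_{r=1}^{k} 2^{k-r} N_r \\
max(n_\phi, 2^k) \leftarrow
\end{array} \qquad (17)$$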

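3. For each general preference $\Psi$:

• if $\Psi$ is an atomic preference then $\Pi_{pref}(\Psi)$ is defined as in the previous item.

• if $\Psi = \Psi_1 \,\&\, \Psi_2$ or $\Psi = \Psi_1 \mid \Psi_2$ then $\Pi_{pref}(\Psi)$ consists of $\Pi_{pref}(\Psi_1) \cup \Pi_{pref}(\Psi_2)$ and the rules

$$\begin{array}{l}
w(n_\Psi, S) \leftarrow w(n_{\Psi_1}, N_1),\ w(n_{\Psi_2}, N_2),\ S = N_1 + N_2 \\
max(n_\Psi, S) \leftarrow max(n_{\Psi_1}, N_1),\ max(n_{\Psi_2}, N_2),\ S = N_1 + N_2
\end{array} \qquad (18)$$

• if $\Psi = \,!\Psi_1$ then $\Pi_{pref}(\Psi)$ consists of $\Pi_{pref}(\Psi_1)$ and the rules

$$\begin{array}{l}
w(n_\Psi, S) \leftarrow w(n_{\Psi_1}, N),\ max(n_{\Psi_1}, M),\ S = M - N \\
max(n_\Psi, S) \leftarrow max(n_{\Psi_1}, S)
\end{array} \qquad (19)$$

• if $\Psi = \Psi_1 \lhd \Psi_2$ then $\Pi_{pref}(\Psi)$ consists of $\Pi_{pref}(\Psi_1) \cup \Pi_{pref}(\Psi_2)$ and the rules

$$\begin{array}{l}
w(n_\Psi, S) \leftarrow w(n_{\Psi_1}, N_1),\ w(n_{\Psi_2}, N_2),\ max(n_{\Psi_2}, M_2),\ S = M_2 * N_1 + N_2 \\
max(n_\Psi, S) \leftarrow max(n_{\Psi_1}, N_1),\ max(n_{\Psi_2}, N_2),\ S = N_2 * N_1 + N_2
\end{array} \qquad (20)$$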
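
To make the bottom-up computation concrete, the following Python sketch mirrors the weight and maximal-weight rules case by case. It is an illustration under stated assumptions, not the logic program itself: preferences are encoded as nested tuples, `sat` maps the (hypothetical) name of each basic desire to its truth value on the trajectory under consideration, and the function name `weight_and_max` is introduced only for this example. The bound for an atomic preference, $2^k$, is taken from the correctness proof of the encoding.

```python
from typing import Dict, Tuple


def weight_and_max(psi: Tuple, sat: Dict[str, bool]) -> Tuple[int, int]:
    """Return (w_psi(alpha), max(psi)) for a preference psi encoded as
    ("desire", name) | ("atomic", [names]) | ("and", p, q) | ("or", p, q)
    | ("neg", p) | ("chain", p, q)."""
    kind = psi[0]
    if kind == "desire":
        # basic desire: weight 1 if satisfied, 0 otherwise; bound 2
        return (1 if sat[psi[1]] else 0), 2
    if kind == "atomic":
        # phi_1 <| ... <| phi_k: binary-weighted sum; bound 2^k
        names = psi[1]
        k = len(names)
        w = sum(2 ** (k - r) for r, name in enumerate(names, start=1) if sat[name])
        return w, 2 ** k
    if kind in ("and", "or"):
        # Psi_1 & Psi_2 and Psi_1 | Psi_2: weights and bounds are added
        w1, m1 = weight_and_max(psi[1], sat)
        w2, m2 = weight_and_max(psi[2], sat)
        return w1 + w2, m1 + m2
    if kind == "neg":
        # !Psi_1: weight is bound minus weight; bound unchanged
        w1, m1 = weight_and_max(psi[1], sat)
        return m1 - w1, m1
    if kind == "chain":
        # Psi_1 <| Psi_2: Psi_1 dominates because it is scaled by max(Psi_2)
        w1, m1 = weight_and_max(psi[1], sat)
        w2, m2 = weight_and_max(psi[2], sat)
        return m2 * w1 + w2, m2 * m1 + m2
    raise ValueError(f"unknown preference kind: {kind}")
```

Selecting, among candidate trajectories, one whose first component is largest corresponds to what the maximize construct does over answer sets.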



The next theorem proves the correctness of $\Pi_{pref}(\Psi)$.

Theorem 5

Let $\langle D, I, G\rangle$ be a planning problem, $\Psi$ be a general preference, and $\alpha = s_0 a_1 s_1 \ldots a_n s_n$ be a trajectory. Let $\Pi = \Pi_{pref}(\Psi) \cup \Pi_{sat} \cup \alpha^{-1}$. Then,

• For every desire $\phi$ with $desire(n_\phi) \in \Pi_{pref}(\Psi)$, we have that $\Pi \models w(n_\phi, w)$ iff $w_\phi(\alpha) = w$ and $\Pi \models max(n_\phi, w)$ iff $max(\phi) = w$.

• For every preference $\eta$ with $preference(n_\eta) \in \Pi_{pref}(\Psi)$, we have that $\Pi \models w(n_\eta, w)$ iff $w_\eta(\alpha) = w$ and $\Pi \models max(n_\eta, w)$ iff $max(\eta) = w$

where

$$\alpha^{-1} = \{occ(a_i, i - 1) \mid i \in \{1, \ldots, n\}\} \cup \{holds(f, i) \mid f \in s_i,\ i \in \{0, \ldots, n\}\}.$$
Proof

Let $\Pi_1$ be the program consisting of the rules (16)-(20) of the program $\Pi_{pref}(\Psi)$ and the set of atoms of the form $preference(n_\phi)$ in $\Pi_{pref}(\Psi)$. Let $S$ be the set of literals occurring in the program $\Pi \setminus \Pi_1$. It is easy to check that $S$ is a splitting set of $\Pi$. Using the Splitting Theorem (Lifschitz and Turner 1994), we can show that $M$ is an answer set of $\Pi$ iff $M = X \cup Y$, where $X$ is an answer set of the program $\Pi \setminus \Pi_1$ and $Y$ is an answer set of the program $\Pi_2$, which is obtained from $\Pi_1$ by replacing the rules of the form (16) with the set of atoms $Z$ where

$$Z = \bigcup_{desire(n_\phi) \in X} \Big( \{w(n_\phi, 1) \mid satisfy(n_\phi, 0) \in X\} \cup \{w(n_\phi, 0) \mid satisfy(n_\phi, 0) \notin X\} \cup \{max(n_\phi, 2)\} \Big). \qquad (21)$$

Observe that for a desire $\phi$ with $desire(n_\phi) \in \Pi_{pref}(\Psi)$ we have that $\Pi_\phi \subseteq \Pi_{pref}(\Psi)$. By applying the results of Theorem 2 we have that $\Pi \setminus \Pi_1$ has a unique answer set $X$ and $satisfy(n_\phi, 0) \in X$ iff $\alpha \models \phi$. Together with the fact that $Z \subseteq Y$, we have that $w(n_\phi, 1) \in M$ iff $\alpha \models \phi$ iff $w_\phi(\alpha) = 1$, and $w(n_\phi, 0) \in M$ iff $\alpha \not\models \phi$ iff $w_\phi(\alpha) = 0$. Furthermore, $max(n_\phi, 2) \in M$ and $w_\phi(\alpha) \le 1$ for every desire $\phi$. This proves the first item of the theorem.

We will now prove the second item of the theorem. To account for the structure of the preference, we associate an integer, denoted by $\lambda(n_\phi)$, to each constant $n_\phi$ such that $preference(n_\phi) \in \Pi_{pref}(\Psi)$ or $desire(n_\phi) \in \Pi_{pref}(\Psi)$. This is done as follows:

$\lambda(n_\phi) = 0$ if $desire(n_\phi) \in \Pi_{pref}(\Psi)$ (i.e., if $\phi$ is a desire);

$\lambda(n_\phi) = 1$ if $preference(n_\phi) \in \Pi_{pref}(\Psi)$ and $\phi = \varphi_1 \lhd \varphi_2 \lhd \cdots \lhd \varphi_k$;

$\lambda(n_\phi) = \lambda(n_{\phi_1}) + \lambda(n_{\phi_2}) + 1$ if $preference(n_\phi) \in \Pi_{pref}(\Psi)$ and $\phi = \phi_1 \,\&\, \phi_2$ or $\phi = \phi_1 \mid \phi_2$; and

$\lambda(n_\phi) = \lambda(n_{\phi_1}) + 1$ if $preference(n_\phi) \in \Pi_{pref}(\Psi)$ and $\phi = \,!\phi_1$.

The proof is done inductively over $\lambda(n_\phi)$.

Base: $\lambda(n_\phi) = 0$ means that $\phi$ is a desire. The claim for this case follows from the first item.

Step: Assume that we have proved the conclusion for $\lambda(n_\phi) < k$. We will now prove it for $\lambda(n_\phi) = k$. Consider a preference $\phi$ with $\lambda(n_\phi) = k$. We have the following cases:

- $\phi = \varphi_1 \lhd \varphi_2 \lhd \cdots \lhd \varphi_k$ and the $\varphi_i$ are basic desires, i.e., $\phi$ is an atomic preference. By definition, we have that the $\varphi_i$'s are desires. It follows from (17) that

  $w(n_\phi, s) \in Y$ iff the body of the first rule in (17) is satisfied by $Y$
  iff $s = \sum_{r=1}^{k} 2^{k-r} \times w_r$ and $w(n_{\varphi_r}, w_r) \in Z$ for $1 \le r \le k$
  iff $s = w_\phi(\alpha)$.

  Furthermore, $max(n_\phi, 2^k) \in Y$ and the maximal weight of $w_\phi(\alpha)$ is $\sum_{r=1}^{k} 2^{k-r} = 2^k - 1$. This proves the inductive step for this case.

- $\phi = \phi_1 \,\&\, \phi_2$ (resp. $\phi = \phi_1 \mid \phi_2$). We have that $\lambda(n_{\phi_1}) < k$ and $\lambda(n_{\phi_2}) < k$. The conclusion follows immediately from the induction hypothesis, the rules in (18), and the definition of $w_\phi(\alpha)$.

- $\phi = \,!\phi_1$. Again, we have that $\lambda(n_{\phi_1}) < k$. Using (19) and the definition of $w_\phi$, we can prove that $w(n_\phi, s) \in Y$ iff $w_\phi(\alpha) = s$ and $max(n_\phi, s) \in Y$ iff $max(\phi) = s$.

- $\phi = \phi_1 \lhd \phi_2$. Again, we have that $\lambda(n_{\phi_1}) < k$ and $\lambda(n_{\phi_2}) < k$. The conclusion follows immediately from (20), the induction hypothesis, and the definition of $w_\phi$.

□

The above theorem implies that we can compute a most preferred trajectory by (i) adding $\Pi_{pref}(\Psi) \cup \Pi_{sat}$ to $\Pi(D, I, G)$ and (ii) computing an answer set $M$ in which $w(n_\Psi, w)$ is maximal. A working implementation of this is available in jsmodels.

4.4 Some Examples of Preferences in $\mathcal{PP}$

We will now present some preferences that are common to many planning problems and have been discussed in (Eiter et al. 2003). The main difference between the encoding presented in this paper and the ones in (Eiter et al. 2003) lies in that we use temporal operators to represent the preferences, while action weights are used in (Eiter et al. 2003). Let $\langle D, I, G\rangle$ be a planning problem. For the discussion in this subsection, we will assume that the answer set planning module $\Pi(D, I, G)$ is capable of generating trajectories without redundant actions, in the sense that no action occurrence is generated once the goal has been achieved. Such a planning module can be easily obtained by adding a constraint to the program $\Pi(D, I, G)$ that prevents action occurrences from being generated once the goal has been achieved. This, however, does not guarantee that the planning module will generate the shortest trajectory if $n$ is greater than the length of the shortest trajectory. In keeping with the notation used in the previous section, we use $\varphi$ to denote $G$ (i.e., $\varphi = G$).

4.4.1 Preference for shortest trajectory - formula based encoding

Assume that we are interested in trajectories achieving $\varphi$ whose length is less than or equal to $n$. A simple encoding that allows us to accomplish this is to make use of basic desires. By $\mathbf{next}^i(\varphi)$ we denote the formula

$$\underbrace{\mathbf{next}(\mathbf{next}(\cdots(\mathbf{next}}_{i}(\varphi))\cdots)).$$

Let us define the formula $\sigma^i(\varphi)$ ($0 \le i \le n$) as follows:

$$\sigma^0(\varphi) = \varphi \qquad \sigma^i(\varphi) = \bigwedge_{j=0}^{i-1} \neg\mathbf{next}^j(\varphi) \wedge \mathbf{next}^i(\varphi)$$

Finally, let us consider the formula $short(n, \varphi)$ defined as

$$short(n, \varphi) = \sigma^0(\varphi) \lhd \sigma^1(\varphi) \lhd \sigma^2(\varphi) \lhd \cdots \lhd \sigma^n(\varphi).$$

Intuitively, this formula says that we prefer trajectories on which the goal $\varphi$ is satisfied as early as possible. It is easy to see that if $\alpha$ is a most preferred trajectory w.r.t. $short(n, \varphi)$ then $\alpha$ is a shortest length trajectory satisfying the goal $\varphi$.
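
The effect of $short(n, \varphi)$ can be seen by computing the atomic preference weight of a few trajectories. The sketch below is an illustration under simplifying assumptions: a trajectory is abstracted to a list `goal_at`, where `goal_at[t]` records whether $\varphi$ holds in state $s_t$; a nested next that looks past the last state is treated as false; and the function names are hypothetical.

```python
from typing import List


def sigma_holds(goal_at: List[bool], i: int) -> bool:
    """sigma^i(phi) at step 0: phi holds i steps ahead and at no earlier step."""
    if i >= len(goal_at):
        return False  # next^i(phi) looks past the end of the trajectory
    return goal_at[i] and not any(goal_at[:i])


def short_weight(goal_at: List[bool], n: int) -> int:
    """Weight w.r.t. short(n, phi) = sigma^0 <| sigma^1 <| ... <| sigma^n,
    i.e. sum_{r=1..k} 2^(k-r) * w_{sigma^(r-1)} with k = n + 1 basic desires."""
    k = n + 1
    return sum(2 ** (k - r) for r in range(1, k + 1) if sigma_holds(goal_at, r - 1))
```

At most one $\sigma^i$ can hold on a given trajectory, so the weight is a single power of two that grows as the goal is reached earlier.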

4.4.2 Preference for shortest trajectory - action based encoding

The formula based encoding $short(n, \varphi)$ requires the bound $n$ to be given. We now present another encoding that does not require this condition. We introduce two additional fictitious actions stop and noop and a new fluent ended. The action stop will be triggered when the goal is achieved; noop is used to fill the remaining slots so that we can compare trajectories; the fluent ended will denote the fact that the goal has been achieved. We add to the action theory the propositions:

stop causes ended
stop executable if $\varphi$
noop causes ended
noop executable if ended

Furthermore, we add the condition $\neg$ended to the executability condition of every action in $\langle D, I\rangle$ and to the initial state $I$. We can encode the condition of shortest length trajectory as follows. Let

short = always((stop ∨ noop) ($a_1$ ∨ ... ∨ $a_k$)).

where $a_1, \ldots, a_k$ are the actions in the original action theory. Again, we can show that any most preferred trajectory w.r.t. short is a shortest length trajectory satisfying the goal $\varphi$. Observe the difference between $short(n, \varphi)$ and short: both are built using temporal connectives, but the former uses fluent formulae and the latter uses actions. The second one, we believe, is simpler than the first one; however, it requires some modifications to the original action theory.

4.4.3 Cheapest plan

Let us assume that we would like to associate a cost $c(a)$ to each action $a$ and determine trajectories that have the minimal cost. Since our comparison is done only on trajectories whose length is less than or equal to length, we will also introduce the two actions noop and stop with no cost and the fluent ended to record the fact that the goal has been achieved. Furthermore, we introduce the fluent $sCost(ct)$ to denote the cost of the trajectory. Intuitively, $sCost(ct)$ being true means that the cost of the trajectory is $ct$. Initially, we set the value of sCost to 0 (i.e., $sCost(0)$ is true initially and $sCost(c)$ is false for every other $c$), and the execution of action $a$ will increase the value of sCost by $c(a)$. This is done by introducing an effect proposition

a causes sCost(N + c(a)) if sCost(N)

for each action $a$. The desired preference is then

goal(sCost(m)) $\lhd$ goal(sCost(m + 1)) $\lhd$ ... $\lhd$ goal(sCost(M))

where $m$ and $M$ are the estimated minimal and maximal cost of the trajectories, respectively. Note that we can have $m = 0$ and $M = \max\{c(a) \mid a \text{ is an action}\} \times$ length.
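
A small numeric sketch may help to see how the chain over goal(sCost(.)) ranks plans. The Python code below is illustrative only: the action names and costs in the usage note are invented, `FREE_ACTIONS`, `trajectory_cost`, `cost_bounds`, and `chain_weight` are hypothetical names, and the weight is the atomic preference weight $2^{k-r}$ of the single desire goal(sCost(ct)) that a trajectory of cost ct satisfies.

```python
from typing import Dict, List, Tuple

FREE_ACTIONS = {"stop", "noop"}  # the fictitious actions carry no cost


def trajectory_cost(actions: List[str], cost: Dict[str, int]) -> int:
    """Value ct such that sCost(ct) holds in the final state."""
    total = 0
    for a in actions:
        total += 0 if a in FREE_ACTIONS else cost[a]
    return total


def cost_bounds(cost: Dict[str, int], length: int) -> Tuple[int, int]:
    """The estimates m = 0 and M = max{c(a)} * length."""
    return 0, max(cost.values()) * length


def chain_weight(ct: int, m: int, M: int) -> int:
    """Weight w.r.t. goal(sCost(m)) <| ... <| goal(sCost(M)).
    Exactly one desire of the chain can hold, namely the one at position ct - m + 1."""
    if not m <= ct <= M:
        return 0
    k = M - m + 1
    r = ct - m + 1
    return 2 ** (k - r)
```

For instance, with hypothetical costs `{"drive": 3, "walk": 1}` and length 3, the plan `["walk", "walk", "stop"]` has cost 2 and receives a larger weight than `["drive", "stop", "noop"]`, whose cost is 3.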

5 Related Work 

The work presented in this paper is the natural continuation of the work we presented in (Son and Pontelli 2004a), where we rely on prioritized default theories to express limited classes of preferences between trajectories, a strict subset of the preferences covered in this paper. This work is also influenced by other works on exploiting domain-specific knowledge in planning (e.g., Bacchus and Kabanza 2000; Dal Lago, Pistore, and Traverso 2002; Son et al. 2005), in which domain-specific knowledge is expressed as a constraint on the trajectories achieving the goal, and hence, is a hard constraint. In Subsection 5.1 we discuss different approaches to planning with preferences which are directly related to our work. In Subsections 5.2-5.3 we present works that are somewhat related to our work and can be used to develop alternative implementations for $\mathcal{PP}$.

(Footnote to the effect proposition for sCost in Section 4.4.3: Because of the grounding requirement of answer set solvers, this encoding will yield a set of effect propositions instead of a single proposition.)

5.1 Planning with Preferences 

Different approaches have been proposed to integrate preferences in the planning process. An approach close in spirit to the one proposed in this paper has been recently developed by Delgrande et al. (2004). The framework they propose introduces qualitative preferences built from two partial preorders, $<_c$ and $<_t$, over the set of propositional formulae of fluents and actions. Intuitively,

• $\varphi_1 <_c \varphi_2$ (choice order) indicates the desire to prefer trajectories that satisfy (at some point in time) the formula $\varphi_2$ over those that satisfy $\varphi_1$.

• $\varphi_1 <_t \varphi_2$ (temporal order) indicates the desire to prefer trajectories that satisfy $\varphi_1$ first and $\varphi_2$ later in the trajectory.

Choice preferences are employed to derive an ordering $\lhd_c$ between trajectories as follows: given trajectories $\alpha, \beta$ we have that

$$\alpha \lhd_c \beta \ \text{ iff } \ \forall \varphi \in \Delta(\alpha, \beta)\ \exists \varphi' \in \Delta(\beta, \alpha).\ (\varphi <_c \varphi')$$

where $\Delta(\gamma, \gamma') = \{\varphi \in dom(<_c) \mid \gamma \models \varphi \text{ and } \gamma' \not\models \varphi\}$ and $\gamma \models \varphi$ denotes the fact that the formula $\varphi$ is true at one of the states reached by the trajectory $\gamma$. The order is made transitive by taking the transitive closure of $\lhd_c$.

The relation $\lhd_c$ can be easily simulated in $\mathcal{PP}$ since $\varphi <_c \varphi'$ determines the same order as eventually($\varphi'$) $\lhd$ eventually($\varphi$). This can be generalized as long as $<_c$ is cycle-free.

Example 8 

Let us consider the monkey-and-banana example as formulated in (Delgrande et al. 2004).
The world includes the following entities: a monkey, a banana hanging from the ceil- 
ing, a coconut on the floor, and a chocolate bar inside a closed drawer. Initially, all 
the entities are in different locations in a room. The room includes also a box that 
can be pushed and climbed on to reach the ceiling and grab the banana. The goal is 
to get the chocolate as well as at least one of the banana or the coconut. The domain 
description includes the following fluents: 

• location(Entity, Location) denoting the current Location of Entity; the domain of Entity is {monkey, banana, coconut, drawer, box} and the domain of Location is {1, . . . , 5} (denoting 5 different positions in the room).

• onBox denoting the fact that the monkey is on top of the box. 

• hasBanana denoting the fact that the monkey has the banana. 

• hasCoconut denoting the fact that the monkey has the coconut. 

• hasChocolate denoting the fact that the monkey has the chocolate. 

• DrawerOpen denoting the fact that the drawer is open. 

The action theory provides actions to walk in the room, move the box, climb on and 
off the box, grab objects, and open drawers. The goal considered here is expressed by 
the fluent formula: 

hasChocolate ∧ (hasCoconut ∨ hasBanana)

The preference discussed in (Delgrande et al. 2004) is that bananas are preferred over coconuts, i.e., hasCoconut $<_c$ hasBanana, and in our framework it can be expressed as

eventually(hasBanana) $\lhd$ eventually(hasCoconut).

This preference can also be represented by a simpler basic desire

goal(hasBanana)

which says that trajectories achieving hasBanana will be most preferred. □
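
The derived choice ordering between trajectories can be illustrated with a short Python sketch. It is a simplification made for this example only: each trajectory is abstracted to the set of formulae (identified by name) that it satisfies at some state, the choice order is given as a set of pairs `(phi, phi2)` meaning $\varphi <_c \varphi'$, and the names `delta` and `choice_leq` are hypothetical. The transitive closure step is not included.

```python
from typing import Set, Tuple


def delta(sat_g: Set[str], sat_g2: Set[str], domain: Set[str]) -> Set[str]:
    """Delta(gamma, gamma'): formulae in dom(<_c) satisfied by gamma but not by gamma'."""
    return {phi for phi in domain if phi in sat_g and phi not in sat_g2}


def choice_leq(sat_a: Set[str], sat_b: Set[str], order: Set[Tuple[str, str]]) -> bool:
    """alpha <|_c beta: every formula that only alpha satisfies is dominated,
    w.r.t. <_c, by some formula that only beta satisfies."""
    domain = {phi for pair in order for phi in pair}
    d_ab = delta(sat_a, sat_b, domain)
    d_ba = delta(sat_b, sat_a, domain)
    return all(any((phi, phi2) in order for phi2 in d_ba) for phi in d_ab)
```

With the order `{("hasCoconut", "hasBanana")}`, a trajectory that only obtains the coconut is below one that only obtains the banana, and not vice versa, which matches the ranking induced by eventually(hasBanana) $\lhd$ eventually(hasCoconut).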

Temporal preferences are employed to derive another preorder $\lhd_t$ between trajectories as follows:

• given a trajectory $\alpha = s_0 a_1 s_1 \ldots a_n s_n$ and two propositions over fluents and actions $\varphi, \varphi'$, then $\varphi <_\alpha \varphi'$ iff

  - $\alpha \models \varphi$ and $\alpha \models \varphi'$ and

  - $i_\varphi \le i_{\varphi'}$, where $s_{i_\varphi}$ (resp. $s_{i_{\varphi'}}$) is the first state in $\alpha$ that satisfies $\varphi$ (resp. $\varphi'$).

• given two trajectories $\alpha, \beta$, we have that $\alpha \lhd_t \beta$ iff $<_t \cap <_\beta^{-1} \ \subseteq\ <_t \cap <_\alpha^{-1}$, where $<_\alpha^{-1}$ is the inverse relation of $<_\alpha$.

Each individual temporal preference $\varphi <_t \varphi'$ can be expressed in our language as the basic desire

$$c(\varphi <_t \varphi') = \mathbf{eventually}(\varphi \wedge \mathbf{eventually}(\varphi')) \wedge \mathbf{until}(\neg\varphi', \varphi)$$

The generalization to a collection of temporal preferences requires some additional constructions. Given a collection of basic desires $S = \{\psi_1, \ldots, \psi_k\}$, then

• for an arbitrary permutation $i_1, \ldots, i_k$ of $1, \ldots, k$, let us define

$$ch(S, i_1, \ldots, i_k) = \bigwedge_{j=1}^{k} \psi_{i_j} \lhd \bigwedge_{j=2}^{k} \psi_{i_j} \lhd \bigwedge_{j=3}^{k} \psi_{i_j} \lhd \cdots \lhd \psi_{i_k}.$$

Intuitively, $ch(S, i_1, \ldots, i_k)$ is an atomic preference representing an ordering between trajectories w.r.t. the set of basic desires $\{\psi_{i_1}, \ldots, \psi_{i_k}\}$. For example, trajectories satisfying $\bigwedge_{j=1}^{k} \psi_{i_j}$ are most preferred; if no trajectory satisfies $\bigwedge_{j=1}^{k} \psi_{i_j}$ then trajectories satisfying $\bigwedge_{j=2}^{k} \psi_{i_j}$ are most preferred; etc.

• let $\{\pi_1, \ldots, \pi_{k!}\}$ be the set of all permutations of $1, \ldots, k$; let us define

$$maxim(S) = ch(S, \pi_1) \mid ch(S, \pi_2) \mid \cdots \mid ch(S, \pi_{k!}).$$

Intuitively, $maxim(S)$ indicates that we prefer trajectories satisfying the maximal number of basic desires from the set $\{\psi_1, \ldots, \psi_k\}$.

If we have a collection of temporal preferences $\{\varphi_i <_t \varphi'_i \mid i = 1, \ldots, k\}$, then the equivalent formula is

$$maxim(\{c(\varphi_i <_t \varphi'_i) \mid i = 1, \ldots, k\}).$$

Example 9

Let us continue Example 8 by removing the choice preference and assuming instead the temporal preference hasBanana $<_t$ hasChocolate, i.e., the banana should be obtained before the chocolate. The corresponding encoding in our language is

eventually(hasBanana ∧ eventually(hasChocolate)) ∧
until(¬hasChocolate, hasBanana)

□
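
The basic desire $c(\varphi <_t \varphi')$ can be checked on concrete traces. The following Python sketch is illustrative only: it abstracts a trajectory to two equally long Boolean lists, `phi_at[t]` and `phi2_at[t]`, recording whether $\varphi$ and $\varphi'$ hold in state $s_t$, and evaluates the formula at step 0. The function name is hypothetical.

```python
from typing import List


def temporal_desire_holds(phi_at: List[bool], phi2_at: List[bool]) -> bool:
    """c(phi <_t phi') = eventually(phi and eventually(phi')) and until(not phi', phi)."""
    assert len(phi_at) == len(phi2_at)
    n = len(phi_at)
    # eventually(phi and eventually(phi')): phi at some t, phi' at some t' >= t
    ev = any(phi_at[t] and any(phi2_at[t:]) for t in range(n))
    # until(not phi', phi): phi at some t2, and phi' false at every earlier step
    un = any(phi_at[t2] and not any(phi2_at[:t2]) for t2 in range(n))
    return ev and un
```

A trace in which the banana is obtained at step 2 and the chocolate at step 4 satisfies the desire, while a trace in which the chocolate is obtained first does not.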

Eiter et al. introduced a framework for planning with action costs using logic programming (Eiter et al. 2003). The focus of their proposal is to express certain classes of quantitative preferences. Each action is assigned an integer cost, and plans with the minimal cost are considered to be optimal. Costs can be either static or relative to the time step in which the action is executed. (Eiter et al. 2003) also presents the encoding of different preferences, such as shortest plan and the cheapest plan. Our approach also emphasizes the use of logic programming, but differs in several aspects. Here, we develop a declarative language for preference representation. Our language can express the preferences discussed in (Eiter et al. 2003), but it is more high-level and flexible than the action costs approach. The approach in (Eiter et al. 2003) also does not allow the use of fully general dynamic preferences. On the other hand, while we only consider planning with complete information, Eiter et al. (2003) deal with planning in the presence of incomplete information and non-deterministic actions.

Other systems have adopted fixed types of preferences, e.g., shortest plans (Cimatti and Roveri 2000; Blum and Furst 1997).

Our proposal has similarities with the approach based on metatheories of the planning domain (Myers 1996; Myers and Lee 1999), where metatheories provide a characterization of semantic differences between the various domain operators and planning variables; metatheories allow the generation of biases to focus the planner towards plans with certain characteristics.

Our work is also related to the work in (Lin 1998), in which the author defined three different measures for plan quality (A-, B-, and C-optimal) and showed how they can be axiomatized in situation calculus. Roughly, a plan is A-optimal if none of its actions can be deleted and the remainder is still a valid plan. It is B-optimal if none of its segments can be deleted and the remainder is still a plan. It is C-optimal if none of its segments can be replaced by a single action and the remainder is still a plan. While these measures are domain-independent, preferences in our language are mostly domain-dependent. Theoretically, these measures could also be expressed in $\mathcal{PP}$ by defining an order among possible plans. This impractical method can be replaced by considering some approximations of these measures. As an example, shortest plans as encoded in the previous section represent a class of A-optimal plans; the atomic preference $\varphi_1 \lhd \varphi_2$ with $\varphi_1$ = occ(a) ∧ executable(a) and $\varphi_2$ = occ(b) ∧ executable(b) ∧ next(occ(c) ∧ executable(c)) could be used to prefer plans with action a over plans containing the sequence b; c; etc.

Our language allows the representation of several types of preferences similar to those developed in (Haddawy and Hanks 1993) for decision-theoretic planners. The fundamental difference is that we use logic programming while their system is probability based. Our approach also differs from the works on using Markov Decision Processes (MDP) to find optimal plans (Puterman 1994): in MDPs, optimal plans are functions from states to actions, thus preventing the user from selecting preferred trajectories without changing the MDP specification.

5.2 High-level Languages for Qualitative Preferences 

Brewka recently proposed (Brewka 2004a) a general rank-based description language for the representation of qualitative preferences between models of a propositional theory. The language has similar foundations to our proposal. The basic preference between models derives from an inherent total preorder between propositional formulae (Ranked Knowledge Base); models can be compared according to one of four possible comparison criteria, i.e., inclusion, cardinality, maximal degree of satisfied formula, and maximal degree of unsatisfied formula. The preference language allows the refinement of the basic preference by using propositional combination as well as meta-ordering between preferences, in a fashion similar to what is described in this paper.

The proposal by Junker (Junker 2001) presents a language designed to express preferences between decisions and decision rules in the context of a language for solving configuration problems. Decisions are described by labeled constraints $t : \varphi$, where $t$ is a term (possibly containing variables) and $\varphi$ is a configuration constraint. The configuration language also allows the creation of named sets of decisions. Preferences between decisions are expressed through statements of the form prefer($t_1$, $t_2$), where $t_1, t_2$ identify decisions or sets of decisions. The language also allows the user to create constraints that assert decisions, thus making it possible to express meta-preferences. For example, the following decisions express different preferences (Junker 2001):

decision rule p1(x):
    if x in instances(Customer) and playboy in characteristics(x)
    then prefer(look, comfort)
decision rule p2(x):
    if x in instances(Customer) and age(x)=old
    then prefer(comfort, look)

and a statement of the type prefer(p1, p2) allows one to express a meta-preference.

5.3 Other Related Works 

Considerable effort has been invested in developing frameworks for expressing prefer- 
ences within the context of constraint programming and constraint logic programming, 
where the problem of inconsistency arises frequently. Most proposals rely on the idea 
of associating preferences (expressed as mathematical entities) to constraints when 
variables are assigned (Schiex and Cooper 2002). Combinations of constraints lead 
to corresponding combinations of preferences, and the frameworks provide means to 
compare preferences; comparisons are commonly employed to select solutions or to 
discriminate between classes of satisfied constraints. A popular scheme relies on the 
use of costs associated with tuples (where each tuple represents a value assignment), and 
costs are drawn from a semiring structure (Schiex et al. 1995; Bistarelli et al. 1997), 
which provides operators to combine preferences and to "maximize" preferences. These 
frameworks subsume various approaches to preferences in CSP, e.g., (Borning et al. 1989; 
Moulin 1988; Fargier and Lang 1993). 

Schiex et al. (Larrosa and Schiex 2003; Schiex and Cooper 2002) recently proposed 
the notion of Valued CSP (VCSP) as an algebraic framework for preferences in constraint 
networks. In VCSP, costs for tuples are drawn from a value structure $(E, \oplus, \preceq)$, where 
$E$ is totally ordered by $\preceq$; the maximum denotes total inconsistency. Intuitively, in 
a valued CSP, each constraint $c_X$ over a set of variables $X$ is viewed as a function 
that maps tuples of values (values drawn from the domains of the variables $X$) to an 
element of $E$ (the "cost" of the tuple). The cost of the constraint allows us to rank 
the "degree" of constraint violation. Given an assignment $t$ for a set of constraints 
$C$, the valuation of the assignment is the composition (via $\oplus$) of the $E$ values of the 
individual constraints. The objective is to determine an assignment which is minimal 
w.r.t. the order $\preceq$. Weighted CSPs (WCSP) are instances of this framework, where 
$E = [0, 1, \ldots, k]$, $\preceq$ is the standard ordering between natural numbers, and $\oplus$ is 
defined as $a \oplus b = \min\{k, a + b\}$. Extensions of arc-consistency to these frameworks have 
been investigated (Larrosa 2002; Larrosa and Schiex 2003; Bistarelli et al. 1997). 
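
The WCSP instance described above can be illustrated with a minimal sketch. The representation of constraints as (scope, cost function) pairs, the function names, and the toy constraints are assumptions made only for this example; they are not taken from the cited frameworks.

```python
from functools import reduce

def bounded_sum(a, b, k):
    """WCSP combination operator: a (+) b = min(k, a + b)."""
    return min(k, a + b)

def valuation(assignment, constraints, k):
    """Combine the costs that each constraint assigns to `assignment`.

    `constraints` is a list of (scope, cost_fn) pairs (a hypothetical
    representation): `scope` is a tuple of variable names and `cost_fn`
    maps the tuple of their values to a cost in {0, ..., k}.
    """
    costs = (cost_fn(tuple(assignment[v] for v in scope))
             for scope, cost_fn in constraints)
    return reduce(lambda a, b: bounded_sum(a, b, k), costs, 0)

# Toy example with k = 5, where cost k denotes total inconsistency.
K = 5
toy_constraints = [
    (("x", "y"), lambda t: 0 if t[0] != t[1] else 3),  # soft "x != y"
    (("y",), lambda t: 4 if t[0] == 1 else 0),          # soft "y != 1"
]
```

With these toy constraints, the assignment x = 1, y = 1 violates both constraints and its valuation saturates at k, while x = 0, y = 2 has valuation 0 and is therefore minimal.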

Qualitative measures of preference in constraint programming have been explored 
through the notion of Ceteris Paribus Networks (CP-nets) (Boutilier et al. 1999). 

A CP-net is a graphical tool to represent qualitative preferences. Let $V$ be a set of 
variables and let us denote with $D(v)$ the domain of variable $v$ ($v \in V$). A CP-net is a 
pair $(G, P)$, where $G$ is a directed (typically acyclic) graph whose vertices are elements 
of $V$, while the edges denote preferential dependences between variables; intuitively, 
preferences for a value for a variable $v$ depend only on the values selected for the 
parents of $v$ in the network. For a given assignment of values to the parents of $v$, the 
CP-net specifies a total order on $D(v)$. An assignment of values $\gamma$ to $V$ is immediately 
preferred to the assignment $\eta$ if there is a variable $v$ such that 

• $\forall u \in V \setminus \{v\}.\ \gamma(u) = \eta(u)$ 

• $\gamma(v)$ is preferred to $\eta(v)$ in the ordering of $D(v)$ specified by the assignment $\gamma$ 
to the parents of $v$. 

In general, an assignment $\gamma$ is CP-preferred to an assignment $\eta$ if there exists a 
sequence of assignments $\gamma_0, \gamma_1, \ldots, \gamma_k$ such that 

• $\gamma_0 = \eta$ 

• $\gamma_i$ is immediately preferred to $\gamma_{i-1}$ (for $1 \leq i \leq k$) 

• $\gamma_k = \gamma$ 
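
The two notions just defined can be sketched as follows. This is only an illustration under stated assumptions: the dictionaries `parents`, `cpt`, and `domains` are a hypothetical encoding of a CP-net, and the chain of improving steps is found by a plain exhaustive search rather than by any algorithm from the cited literature.

```python
def immediately_preferred(gamma, eta, parents, cpt):
    """Return True if `gamma` is immediately preferred to `eta`.

    parents[v] is a tuple of parent variables of v; cpt[v] maps a tuple of
    parent values to a list of the values of v, most preferred first.
    Both structures are hypothetical encodings of the CP-net.
    """
    differing = [v for v in gamma if gamma[v] != eta[v]]
    if len(differing) != 1:
        return False  # the assignments must differ on exactly one variable
    v = differing[0]
    context = tuple(gamma[p] for p in parents[v])
    order = cpt[v][context]
    return order.index(gamma[v]) < order.index(eta[v])

def cp_preferred(gamma, eta, parents, cpt, domains):
    """Search for a chain eta = g0, g1, ..., gk = gamma of improving steps."""
    start = tuple(sorted(eta.items()))
    goal = tuple(sorted(gamma.items()))
    seen, frontier = {start}, [dict(eta)]
    while frontier:
        current = frontier.pop()
        for v in domains:
            for value in domains[v]:
                candidate = {**current, v: value}
                key = tuple(sorted(candidate.items()))
                if key in seen:
                    continue
                if not immediately_preferred(candidate, current, parents, cpt):
                    continue
                if key == goal:
                    return True
                seen.add(key)
                frontier.append(candidate)
    return False

# Toy CP-net: a = 1 is preferred to a = 0; b should take the same value as a.
parents = {"a": (), "b": ("a",)}
cpt = {"a": {(): [1, 0]}, "b": {(1,): [1, 0], (0,): [0, 1]}}
domains = {"a": [0, 1], "b": [0, 1]}
```

In the toy network, the assignment a = 1, b = 1 is CP-preferred to a = 0, b = 0 through the chain (0, 0), (1, 0), (1, 1), while no improving step leaves (1, 1).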

Algorithms for solving constraint optimization problems under preference ordering 
specified by CP-nets have been proposed in the literature (Domshlak and Brafman 2002; 
Boutilier et al. 2004). 

Constraint solving has also been proposed for the management of planning in the 
presence of action costs (Kautz and Walser 1999). 

Considerable effort has been invested in introducing preferences in logic programming. 
In (Cui and Swift 2002), preferences are expressed at the level of atoms and used 
for parsing disambiguation in logic grammars. Rule-level preferences have been used 
in various proposals for the selection of preferred answer sets in answer set programming 
(Brewka and Eiter 1999; Delgrande et al. 2003; Gelfond and Lifschitz 1998; Schaub and Wang 2001). 
Some of the existing answer set solvers include limited forms of (numerical) optimization 
capabilities. The smodels system (Simons et al. 2002) offers the ability to associate weights 
to atoms and to compute answer sets that minimize or maximize the total weight. 
DLV (Buccafurri et al. 2000) provides the notion of weak constraints, i.e., constraints 
of the form 

$:\sim\ \ell_1, \ldots, \ell_k.\ [w : l]$ 

where $w$ is a numeric penalty for violating the constraint, and $l$ is a priority level. The 
total cost of violating constraints at each priority level is computed, and answer sets 
are compared to minimize total penalty (according to a lexicographic ordering based 
on priority levels). 
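
The comparison induced by weak constraints can be sketched as below. The triple representation (body, weight, level), the restriction to positive bodies, and the convention that a larger level number denotes a more important level are assumptions of this sketch, not a description of the DLV implementation.

```python
from collections import defaultdict

def penalty_vector(answer_set, weak_constraints):
    """Total violation weight per priority level, highest level first.

    Each weak constraint is a hypothetical triple (body, weight, level),
    where `body` is a set of atoms; the constraint counts as violated when
    every atom of its body belongs to the answer set.
    """
    totals = defaultdict(int)
    levels = {level for _, _, level in weak_constraints}
    for body, weight, level in weak_constraints:
        if body <= answer_set:
            totals[level] += weight
    return tuple(totals[level] for level in sorted(levels, reverse=True))

def best_answer_sets(answer_sets, weak_constraints):
    """Keep the answer sets whose penalty vector is lexicographically minimal."""
    scored = [(penalty_vector(s, weak_constraints), s) for s in answer_sets]
    best = min(score for score, _ in scored)
    return [s for score, s in scored if score == best]

# Toy data: one constraint at level 2 and two constraints at level 1.
toy_weak = [({"a"}, 1, 2), ({"b"}, 5, 1), ({"c"}, 1, 1)]
toy_sets = [{"a"}, {"b"}, {"c"}, {"b", "c"}]
```

Because tuples are compared position by position, a violation at the more important level always outweighs any amount of penalty accumulated at lower levels.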

5.4 Alternative Encodings of PP 

In this section we discuss the possibility of implementing PP using inference back-ends 
different from smodels or jsmodels. It should be noted that the encoding proposed 
in Section 21 can be translated into dlv code with little effort, while this is not the case for 
other answer set programming systems (e.g., cmodels, ASSAT), since they do not 
offer a construct similar to the maximize construct of smodels. 

In particular, we explore the relationships between PP and two relatively new 
answer set programming frameworks: Logic Programming with Ordered Disjunctions 
and Answer Set Optimization. The key idea in both cases is to show that each 
preference of PP can be mapped to a collection of rules in these two languages. Below 
we provide some details of these languages and their use for expressing PP. 

5.4.1 Logic Programming with Ordered Disjunctions (LPOD) 

Overview of LPOD: In Logic Programming with Ordered Disjunctions (Brewka et al. 2002), 
a program is a collection of ground rules of the form 

$A_1 \times \cdots \times A_k \leftarrow B_1, \ldots, B_n, \mathit{not}\ C_1, \ldots, \mathit{not}\ C_m$ 

The literals in the head of the rule represent alternative choices; in the specific case 
of LPOD, the choices are ordered, where $A_1$ is the most preferred choice, while $A_k$ is 
the least preferred one. 

The semantics of an LPOD program $P$ is based on the general idea of answer sets 
and the concept of split of a program. For each rule $A_1 \times \cdots \times A_k \leftarrow Body$, the $i$-th 
option of the rule is the standard logic programming clause 

$A_i \leftarrow Body, \mathit{not}\ A_1, \ldots, \mathit{not}\ A_{i-1}$ 

A split of the program $P$ is a standard logic program obtained by replacing each rule 
of $P$ by one of its options. 

Given an LPOD program $P$ and a set of ground literals $S$, $S$ is an answer set 
of $P$ iff $S$ is an answer set of a split of $P$. 
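
The notions of option and split can be made concrete with a small sketch. Rules are represented here as triples (heads, positive body, negated body) of atom names; this representation and the function names are assumptions chosen for illustration.

```python
from itertools import product

def option(rule, i):
    """Return the i-th option (1-based) of an ordered-disjunction rule.

    A rule is a hypothetical triple (heads, pos_body, neg_body) of lists
    of atom names; the option has head A_i and additionally requires
    'not A_1, ..., not A_{i-1}' in its body.
    """
    heads, pos_body, neg_body = rule
    return (heads[i - 1], list(pos_body), list(neg_body) + heads[:i - 1])

def splits(program):
    """Enumerate every split: one option chosen for each rule."""
    choices = [[option(r, i) for i in range(1, len(r[0]) + 1)]
               for r in program]
    return [list(split) for split in product(*choices)]

# Toy program: "a x b x c <- d, not e" and "x x y <-".
toy_program = [(["a", "b", "c"], ["d"], ["e"]), (["x", "y"], [], [])]
```

A program whose rules have $k_1, \ldots, k_m$ head literals has $k_1 \cdots k_m$ splits, so the toy program above has six; computing the answer sets of each split is left to an answer set solver.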

Ordered disjunctions are employed to create a preference order between answer sets 
of a program. Different ordering criteria have been discussed (Brewka et al. 2002). 
Given an answer set $S$ of an LPOD program $P$, we say that $S$ satisfies a rule $r$ 

$A_1 \times \cdots \times A_k \leftarrow Body$ 

• with degree 1 ($deg_S(r) = 1$) if $S \not\models Body$ 

• with degree $i$ ($deg_S(r) = i$) if $S \models Body$ and $i = \min\{j \mid S \models A_j\}$ 

We let $S^i(P) = \{r \in P \mid deg_S(r) = i\}$. The three criteria for comparing 
answer sets under LPOD are the following. Let $S_1, S_2$ be two answer sets of $P$: 

• $S_1$ is cardinality preferred to $S_2$ ($S_1 >_c S_2$) iff $\exists i$ such that $|S_1^j(P)| = |S_2^j(P)|$ 
for $j < i$ and $|S_1^i(P)| > |S_2^i(P)|$. 

• $S_1$ is inclusion preferred to $S_2$ ($S_1 >_i S_2$) iff $\exists i$ such that $S_1^j(P) = S_2^j(P)$ for 
$j < i$ and $S_1^i(P) \supset S_2^i(P)$. 

• $S_1$ is Pareto preferred to $S_2$ ($S_1 >_p S_2$) iff 

- $\exists r \in P.\ deg_{S_1}(r) < deg_{S_2}(r)$ 

- $\neg\exists r' \in P.\ deg_{S_1}(r') > deg_{S_2}(r')$ 

LPOD also allows the use of meta-preferences between rules, of the form $r_1 \succ r_2$. 
The Pareto preference in this case is modified as follows: $S_1 >_p S_2$ iff 

• $\exists r \in P.\ deg_{S_1}(r) < deg_{S_2}(r)$ 

• $\forall r \in P$, if $deg_{S_1}(r) > deg_{S_2}(r)$ then there exists another rule $r'$ such that $r' \succ r$ 
and $deg_{S_1}(r') < deg_{S_2}(r')$ 
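
The satisfaction degree and the three basic comparison criteria (without meta-preferences) can be sketched as follows. Rules are again hypothetical triples (heads, positive body, negated body), answer sets are Python sets of atoms, and the code assumes that its inputs really are answer sets, so that some head literal holds whenever the body does.

```python
def degree(answer_set, rule):
    """Satisfaction degree of an ordered-disjunction rule in an answer set."""
    heads, pos_body, neg_body = rule
    body_holds = (all(b in answer_set for b in pos_body)
                  and all(c not in answer_set for c in neg_body))
    if not body_holds:
        return 1
    return min(j for j, a in enumerate(heads, start=1) if a in answer_set)

def degree_sets(answer_set, program):
    """List whose (i-1)-th entry holds the indices of rules with degree i."""
    max_deg = max(len(rule[0]) for rule in program)
    return [{n for n, r in enumerate(program) if degree(answer_set, r) == i}
            for i in range(1, max_deg + 1)]

def cardinality_preferred(s1, s2, program):
    for d1, d2 in zip(degree_sets(s1, program), degree_sets(s2, program)):
        if len(d1) != len(d2):
            return len(d1) > len(d2)  # first degree where the sizes differ
    return False

def inclusion_preferred(s1, s2, program):
    for d1, d2 in zip(degree_sets(s1, program), degree_sets(s2, program)):
        if d1 != d2:
            return d1 > d2  # proper superset at the first differing degree
    return False

def pareto_preferred(s1, s2, program):
    pairs = [(degree(s1, r), degree(s2, r)) for r in program]
    return any(a < b for a, b in pairs) and not any(a > b for a, b in pairs)

# Toy program: "a x b <-" and "c x d <-".
lpod = [(["a", "b"], [], []), (["c", "d"], [], [])]
```

For the toy program, {a, c} is preferred to {a, d} under the cardinality and Pareto criteria, whereas {a, d} and {b, c} are incomparable under the Pareto criterion because each is better on one rule.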

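Translation of our Preferences: Let us start by providing a logic programming encoding 
of basic desires. For a basic desire $\psi$ we define two entities: $Core^\psi(T)$ is a unary 
predicate, while $rule^\psi(T)$ is a collection of rules, where $T$ is a variable representing 
the time step. 

• if $\psi \equiv occ(a)$ ($a \in A$) then $Core^\psi(T) = occ(a, T)$ and $rule^\psi(T) = \emptyset$. 

• if $\psi \equiv f$ ($f$ is a fluent literal) then $Core^\psi(T) = holds(f, T)$ and $rule^\psi(T) = \emptyset$. 

• if $\psi \equiv \psi_1 \wedge \psi_2$ then $Core^\psi(T) = p^\psi(T)$ and 

$$rule^\psi(T) = \left\{ \begin{array}{l} \leftarrow p^\psi(T), \mathit{not}\ Core^{\psi_1}(T) \\ \leftarrow p^\psi(T), \mathit{not}\ Core^{\psi_2}(T) \\ p^\psi(T) \leftarrow Core^{\psi_1}(T), Core^{\psi_2}(T) \end{array} \right\}$$ 

where $p^\psi$ is a new unary predicate. 

• if $\psi \equiv \psi_1 \vee \psi_2$ then $Core^\psi(T) = p^\psi(T)$ and 

$$rule^\psi(T) = \left\{ \begin{array}{l} \leftarrow p^\psi(T), \mathit{not}\ Core^{\psi_1}(T), \mathit{not}\ Core^{\psi_2}(T) \\ p^\psi(T) \leftarrow Core^{\psi_1}(T) \\ p^\psi(T) \leftarrow Core^{\psi_2}(T) \end{array} \right\}$$ 

where $p^\psi$ is a new unary predicate. 

• if $\psi \equiv \neg\psi_1$ then $Core^\psi(T) = p^\psi(T)$ and 

$$rule^\psi(T) = \left\{ \begin{array}{l} p^\psi(T) \leftarrow \mathit{not}\ Core^{\psi_1}(T) \\ \leftarrow p^\psi(T), Core^{\psi_1}(T) \end{array} \right\}$$ 

where $p^\psi$ is a new unary predicate. 

• if $\psi \equiv \mathbf{next}(\psi_1)$ then $Core^\psi(T) = p^\psi(T)$ and 

$$rule^\psi(T) = \left\{ \begin{array}{l} \leftarrow p^\psi(T), \mathit{not}\ Core^{\psi_1}(T+1) \\ p^\psi(T) \leftarrow Core^{\psi_1}(T+1) \end{array} \right\}$$ 

where $p^\psi$ is a new unary predicate. 

• if $\psi \equiv \mathbf{always}(\psi_1)$ then $Core^\psi(T) = p^\psi(T)$ and 

$$rule^\psi(T) = \left\{ \begin{array}{l} \leftarrow p^\psi(T), \mathit{not}\ Core^{\psi_1}(T_1), T \leq T_1, T_1 \leq n \\ p^\psi(T) \leftarrow always^{\psi_1}(T) \\ always^{\psi_1}(n) \leftarrow Core^{\psi_1}(n) \\ always^{\psi_1}(T) \leftarrow T < n, Core^{\psi_1}(T), always^{\psi_1}(T+1) \end{array} \right\}$$ 

where $p^\psi$ is a new unary predicate. 

• if $\psi \equiv \mathbf{eventually}(\psi_1)$ then $Core^\psi(T) = p^\psi(T)$ and 

$$rule^\psi(T) = \left\{ \begin{array}{l} \leftarrow p^\psi(T), \mathit{not}\ Core^{\psi_1}(T), \mathit{not}\ Core^{\psi_1}(T+1), \ldots, \mathit{not}\ Core^{\psi_1}(n) \\ p^\psi(T) \leftarrow Core^{\psi_1}(T_1), T \leq T_1, T_1 \leq n \end{array} \right\}$$ 

where $p^\psi$ is a new unary predicate. 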
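
The inductive translation above lends itself to a recursive generator. The sketch below covers only atoms, the propositional connectives, and next; the nested-tuple representation of formulas, the fresh predicate names p1, p2, ..., and the lparse-like textual syntax of the emitted rules are all assumptions made for illustration.

```python
from itertools import count

def translate(formula, fresh=None, rules=None):
    """Return (core, rules) for a basic desire.

    `core` is a template for the Core predicate with a `{t}` slot for the
    time step; `rules` is the list of generated rules as text. Formulas
    are nested tuples such as ("and", ("occ", "a"), ("lit", "f")).
    """
    fresh = fresh if fresh is not None else count(1)
    rules = rules if rules is not None else []
    op = formula[0]
    if op == "occ":
        return "occ(" + formula[1] + ",{t})", rules
    if op == "lit":
        return "holds(" + formula[1] + ",{t})", rules
    # Translate the sub-formulas first, then introduce a fresh predicate.
    subs = [translate(f, fresh, rules)[0] for f in formula[1:]]
    p = "p%d({t})" % next(fresh)
    head = p.format(t="T")
    c = [s.format(t="T") for s in subs]
    if op == "and":
        rules += [f":- {head}, not {c[0]}.", f":- {head}, not {c[1]}.",
                  f"{head} :- {c[0]}, {c[1]}."]
    elif op == "or":
        rules += [f":- {head}, not {c[0]}, not {c[1]}.",
                  f"{head} :- {c[0]}.", f"{head} :- {c[1]}."]
    elif op == "not":
        rules += [f"{head} :- not {c[0]}.", f":- {head}, {c[0]}."]
    elif op == "next":
        nxt = subs[0].format(t="T+1")
        rules += [f":- {head}, not {nxt}.", f"{head} :- {nxt}."]
    else:
        raise ValueError("unsupported operator: " + op)
    return p, rules
```

For example, translating `("and", ("occ", "a"), ("not", ("lit", "f")))` yields the template `p2({t})` together with five rules, two for the negation and three for the conjunction. The always and eventually cases could be added in the same style but are omitted to keep the sketch short.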



Let us define $\Pi^\psi(i) = rule^\psi(i) \cup \{Core^\psi(i) \times \neg Core^\psi(i)\}$ and let us denote with 
$r^\psi(i)$ the rule $Core^\psi(i) \times \neg Core^\psi(i)$. We can show that if $(D, I, G)$ is a planning 
problem and $\psi$ is a basic desire, then the following holds: 

$S_1 >_p S_2$ iff $\pi(S_1) \models \psi$ and $\pi(S_2) \not\models \psi$ 

where $S_1, S_2$ are two answer sets of $\Pi(D, I, G) \cup \Pi^\psi(i)$ and $\pi(S)$ denotes the trajectory 
represented by $S$. 

Let us extend the encoding above to include atomic preferences. In particular, given 
an atomic preference $\psi$ of the form $\psi_1 \lhd \psi_2$, we define 

$$\Pi^\psi(i) = rule^{\psi_1}(i) \cup rule^{\psi_2}(i) \cup \left\{ \begin{array}{ll} (r_1) & Core^{\psi_1}(i) \times \mathit{not}\ Core^{\psi_1}(i) \\ (r_2) & Core^{\psi_2}(i) \times \mathit{not}\ Core^{\psi_2}(i) \\ & r_1 \succ r_2 \end{array} \right\}$$ 

A result similar to the one above can be derived: for a planning problem $(D, I, G)$ and 
an atomic preference $\psi$, if $S_1, S_2$ are two answer sets of $\Pi(D, I, G) \cup \Pi^\psi(i)$, then 

$S_1 >_p S_2$ iff $\pi(S_1) \prec_\psi \pi(S_2)$. 

The encoding of general preferences in the LPOD framework does not appear to be 
as simple as in the previous cases. The encoding is clearly possible: it is sufficient to 
make use of the weight-based encoding presented in an earlier section. If $\psi$ is the 
preference and $max(n_\psi, v)$ is true, then we can introduce the rule 

$w(n_\psi, v) \times w(n_\psi, v - 1) \times \cdots \times w(n_\psi, 1).$ 

The resulting encoding, on the other hand, is not any simpler than the direct encoding 
in smodels with atom weights. 
5.4.2 Answer Set Optimization (ASO) 

Overview of Answer Set Optimization: The paradigm of Answer Set Optimization was 
originally introduced by Brewka, Niemelä, and Truszczyński (Brewka et al. 2003) and 
later refined by Brewka (Brewka 2004b). 

In ASO, a program is composed of two parts $(P_{gen}, P_{pref})$, where $P_{gen}$ is an arbitrary 
logic program (the generator program) and $P_{pref}$ is a collection of preference 
rules, used to define a preorder over the answer sets of $P_{gen}$. The basic rules 
in $P_{pref}$ are of the form 

$C_1 : p_1 > \cdots > C_n : p_n \leftarrow Body$    (22) 

where the $p_i$ are numerical weights and the $C_i$ are propositional formulae. The complex 
types of preference rules in $P_{pref}$ are defined inductively using rules of the form (22) 
and the constructors psum, inc, rinc, card, rcard, pareto, and lex. 

For each rule $r$ of the form (22), an answer set $S$ of the program $(P_{gen}, P_{pref})$ yields 
a penalty $pen(S, r)$, which is defined by (i) $pen(S, r) = p_j$, where $j = \min\{i \mid S \models C_i\}$, 
if $S$ satisfies $Body$ and at least one $C_i$, and (ii) $pen(S, r) = 0$ otherwise. This penalty 
is used in defining a preorder among answer sets of the program as follows. 

Given two answer sets $S_1, S_2$ of an ASO program $P$, we have that $S_1$ is preferred 
to $S_2$ ($S_1 \geq S_2$) w.r.t. a rule $r$ in $P$ of the form (22) if 

$pen(S_1, r) \leq pen(S_2, r)$. 
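
The penalty of a single preference rule can be sketched as follows. Representing conditions and bodies as Python predicates over answer sets is an assumption of this sketch, and so is the value 0 returned when the rule does not apply.

```python
def penalty(answer_set, rule):
    """Penalty of `answer_set` w.r.t. a rule C_1:p_1 > ... > C_n:p_n <- Body.

    A rule is a hypothetical pair (options, body): `options` is a list of
    (condition, weight) pairs, and `body` and each `condition` are
    predicates over answer sets. Returning 0 when the body fails or no
    condition holds is an assumption of this sketch.
    """
    options, body = rule
    if body(answer_set):
        for condition, weight in options:
            if condition(answer_set):
                return weight  # weight of the first satisfied condition
    return 0

def preferred_wrt_rule(s1, s2, rule):
    """s1 is at least as good as s2 when its penalty is not larger."""
    return penalty(s1, rule) <= penalty(s2, rule)

# Toy rule: "a : 0 > b : 2 <- c".
toy_rule = ([(lambda s: "a" in s, 0), (lambda s: "b" in s, 2)],
            lambda s: "c" in s)
```

Under the toy rule, {a, c} receives penalty 0 and {b, c} receives penalty 2, so the former is preferred; an answer set that does not contain c is unaffected by the rule.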

More complex types of preorder can be described by combining preference rules using 
a predefined set of constructors: 

• $(psum\ e_1, \ldots, e_k)$, where $S_1 \geq S_2$ iff 

$$\sum_{i=1}^{k} pen(S_1, e_i) \leq \sum_{i=1}^{k} pen(S_2, e_i)$$ 

• $(rinc\ e_1, \ldots, e_k)$, where $S_1 \geq S_2$ iff 

$$\exists\, 1 \leq i \leq k.\ (Pen^i(S_1) \supset Pen^i(S_2) \wedge \forall j < i.\ (Pen^j(S_1) = Pen^j(S_2)))$$ 

or 

$$\forall\, 1 \leq i \leq k.\ (Pen^i(S_1) = Pen^i(S_2))$$ 

where $Pen^i(S) = \{j \mid pen(S, e_j) = i\}$. 

• $(rcard\ e_1, \ldots, e_k)$, where $S_1 \geq S_2$ iff 

$$\exists\, 1 \leq i \leq k.\ (|Pen^i(S_1)| > |Pen^i(S_2)| \wedge \forall j < i.\ (|Pen^j(S_1)| = |Pen^j(S_2)|))$$ 

or 

$$\forall\, 1 \leq i \leq k.\ (|Pen^i(S_1)| = |Pen^i(S_2)|)$$ 

• $(lex\ e_1, \ldots, e_k)$, where $S_1 \geq S_2$ iff 

$$\exists\, 1 \leq i \leq k.\ (S_1 >_i S_2 \wedge \forall j < i.\ (S_1 \geq_j S_2))$$ 

or 

$$\forall\, 1 \leq i \leq k.\ (S_1 \geq_i S_2)$$ 

where $\geq_i$ is the preorder associated to the expression $e_i$. 

• $(pareto\ e_1, \ldots, e_k)$, where $S_1 \geq S_2$ iff 

$$\forall\, 1 \leq i \leq k.\ (S_1 \geq_i S_2)$$ 

where each $e_i$ is a preference rule in $P_{pref}$. 
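
Three of these constructors can be sketched as higher-order functions. Here a preorder is modeled as a Python function taking two answer sets and returning a boolean, and penalties are plain functions over answer sets; both choices, as well as the reading of lex used below (a strict improvement at some position with no loss at earlier positions, or no loss anywhere), are assumptions of this sketch.

```python
def rule_order(pen):
    """Preorder induced by a penalty function: lower penalty is better."""
    return lambda s1, s2: pen(s1) <= pen(s2)

def psum(*pens):
    """Compare the sums of the penalties of the component expressions."""
    return lambda s1, s2: (sum(p(s1) for p in pens)
                           <= sum(p(s2) for p in pens))

def pareto(*orders):
    """s1 >= s2 iff s1 is at least as good under every component order."""
    return lambda s1, s2: all(geq(s1, s2) for geq in orders)

def lex(*orders):
    """Lexicographic combination of the component orders."""
    def geq(s1, s2):
        if all(o(s1, s2) for o in orders):
            return True
        for i, o in enumerate(orders):
            strictly = o(s1, s2) and not o(s2, s1)
            if strictly and all(e(s1, s2) for e in orders[:i]):
                return True
        return False
    return geq

# Toy penalties: each one penalizes the absence of a given atom.
pen_a = lambda s: 0 if "a" in s else 1
pen_b = lambda s: 0 if "b" in s else 1
order_a, order_b = rule_order(pen_a), rule_order(pen_b)
```

With the toy penalties, {a} and {b} are incomparable under pareto, equivalent under psum, and {a} is preferred under lex because the first expression is given priority.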

Encoding of our Preferences: As for the case of LPOD, the encoding of our preference 
language in ASO is simple for the first two levels (basic desires and atomic preferences), 
while it is more complex in the case of general preferences. 

We will follow an encoding structure that is analogous to the one used in Section 
5.4.1. In particular, we maintain the same definition of $Core^\psi(T)$ and $rule^\psi(T)$. In 
this case, the generator program $P_{gen}$ corresponds simply to the $\Pi(D, I, G)$ program 
that encodes the planning problem. The preference rules employed in the various cases 
are the following: 

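• if $\psi(T) \equiv occ(a)$ then 

$e^\psi(T) \equiv occ(a, T) > \neg occ(a, T) \leftarrow$ . 

• if $\psi(T) \equiv f$ (where $f$ is a fluent literal) then 

$e^\psi(T) \equiv holds(f, T) > \neg holds(f, T) \leftarrow$ . 

• if $\psi(T) \equiv \psi_1(T) \wedge \psi_2(T)$ then 

$e^\psi(T) \equiv (Core^{\psi_1}(T) \wedge Core^{\psi_2}(T)) > \top \leftarrow$ . 

(where $\top$ is a tautology). 

• if $\psi(T) \equiv \psi_1(T) \vee \psi_2(T)$ then 

$e^\psi(T) \equiv (Core^{\psi_1}(T) \vee Core^{\psi_2}(T)) > \top \leftarrow$ . 

• if $\psi(T) \equiv \neg\psi_1(T)$ then 

$e^\psi(T) \equiv \neg Core^{\psi_1}(T) > Core^{\psi_1}(T) \leftarrow$ . 

• if $\psi(T) \equiv \mathbf{next}(\psi_1(T))$ then 

$e^\psi(T) \equiv Core^{\psi_1}(T + 1) > \neg Core^{\psi_1}(T + 1) \leftarrow$ . 

• if $\psi(t) \equiv \mathbf{eventually}(\psi_1(t))$ (with $1 \leq t \leq n$, where $n$ is the length of the desired 
plan) then 

$e^\psi(t) \equiv (Core^{\psi_1}(t) \vee Core^{\psi_1}(t + 1) \vee \ldots \vee Core^{\psi_1}(n)) > \top \leftarrow$ . 

• if $\psi(t) \equiv \mathbf{always}(\psi_1(t))$ then 

$e^\psi(t) \equiv (Core^{\psi_1}(t) \wedge Core^{\psi_1}(t + 1) \wedge \ldots \wedge Core^{\psi_1}(n)) > \top \leftarrow$ . 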
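
The two temporal cases simply unfold the operator over the remaining time steps. A minimal sketch of this unfolding, producing the rule as text, is given below; the function name, the textual connectives `v` and `&`, and the `core` callback are assumptions made for illustration.

```python
def aso_temporal_rule(kind, core, t, n):
    """Build the ASO preference rule text for eventually/always at step t.

    `core` is a hypothetical callable returning the Core atom for time
    step i as text; `n` is the plan length. "T" stands for the tautology.
    """
    if not 1 <= t <= n:
        raise ValueError("t must satisfy 1 <= t <= n")
    connective = {"eventually": " v ", "always": " & "}[kind]
    body = connective.join(core(i) for i in range(t, n + 1))
    return "(" + body + ") > T <- ."
```

For instance, `aso_temporal_rule("eventually", lambda i: f"holds(f,{i})", 2, 4)` unfolds the desire over steps 2, 3, and 4; the size of the resulting rule grows linearly with the number of remaining steps.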

With respect to the original definition of ASO, which allows for a ranked sequence 
of preference programs, atomic preferences of the type $\psi_1 \lhd \psi_2$ can be encoded as 
$(\{e^{\psi_1}\}, \{e^{\psi_2}\})$. In the extended ASO model proposed in (Brewka 2004b), the same 
effect can be obtained by using the expression 

$(pareto\ e^{\psi_1}, e^{\psi_2})$ 

Only some of the general preferences can be directly encoded without relying on 
the use of numeric weights. 
• if $\psi = \psi_1 \,\&\, \psi_2$, then we can introduce the expression 

$e^\psi = (pareto\ e^{\psi_1}, e^{\psi_2})$ 

• if $\psi = \psi_1 \lhd \psi_2$, then we can introduce the expression 

$e^\psi = (lex\ e^{\psi_1}, e^{\psi_2})$ 

The other cases appear to require the use of weights, leading to an encoding as complex 
as the weight-based one presented in an earlier section. 

For the cases listed above, we can assert the following result: for a planning problem 
$(D, I, G)$, a preference $\psi$, and two answer sets $S_1, S_2$ of $\Pi(D, I, G)$, it holds that 

$S_1 \geq_\psi S_2$ iff $\pi(S_1) \preceq_\psi \pi(S_2)$ 

where $\geq_\psi$ is the preorder derived from the expression $e^\psi$. 



6 Conclusion and Future Work 

In this paper we presented a novel declarative language, called PP, for the specification 
of preferences in the context of planning problems. The language integrates nicely 
with traditional action description languages (e.g., B) and it allows the elegant encoding 
of complex preferences between trajectories. The language provides a declarative 
framework for the encoding of preferences, allowing users to focus on the high-level 
description of preferences rather than on their encodings, as is required in approaches 
based on utility functions. PP allows the expression of complex preferences, including 
multi-dimensional preferences. We also demonstrated that PP preferences can be elegantly 
handled in a logic programming framework based on answer set semantics. 

The implementation of the language PP in the jsmodels system is almost complete, 
and this will offer us the opportunity to validate our ideas on large test cases and to 
compare with related work such as that in (Eiter et al. 2003). We would also like to 
develop a direct implementation of the language which can guarantee completeness. In 
other words, we would like to develop a system that can return all possible preferred 
trajectories. 

We also intend to explore the possibility of introducing temporal operators at the 
level of general preferences. These seem to allow for very compact representation of 
various types of preferences; for example, a shortest plan preference can be encoded 
simply as: 

$\mathbf{always}((occ(stop) \vee occ(noop)) \lhd (occ(a_1) \vee \ldots \vee occ(a_k)))$ 

if $a_1, \ldots, a_k$ are the possible actions. We also intend to natively include in the language 
preferences like maxim used in Section 5.1; these preferences are already expressible 
in the existing PP language, but at the expense of large and complex preference 
formulae. Furthermore, we would like to develop a system that can assist users in 
defining the preferences given the planning problem. 